Correlation and Regression
Dr. Yogesh A. Garde
Assistant Professor (Agril. Statistics)
Introduction
• Univariate analysis: The study of the characteristics of only one
variable, such as height, weight, age, marks, wages, etc.
• Bivariate analysis: The statistical analysis of the relationship
between two variables.
• Univariate population: A population characterized by a single
variable, e.g. the population of students' heights
• Bivariate population: A population in which two variables are
studied simultaneously, e.g. the heights and weights of students
Simple Correlation
• A measure of the degree or extent of linear relationship
between two variables X and Y.
• Correlation is an analysis of the co-variation between
two or more variables
• It is a unit free measure
• Uses of Correlation:
• It is used in physical and social sciences.
• It is useful for economists to study the relationship between
variables like price, quantity, etc. Businessmen estimate
costs, sales, prices, etc. using correlation.
• It is helpful in measuring the degree of relationship between
the variables like income and expenditure, price and supply,
supply and demand etc.
• Sampling error can be calculated.
• It is the basis for the concept of regression
Scatter Diagram:
• The simplest method of studying the relationship
between two variables diagrammatically
• It does not give the exact degree of correlation
between the two variables (which ranges from −1 to +1)
Types of Correlation:
• Positive and negative Correlation
• Linear and non-linear(Curvi-linear) Correlation
• Partial and total Correlation
• Simple and Multiple Correlation
• Positive: If the two variables tend to move together in
the same direction (increase/decrease)
• Negative: If the two variables tend to move together in
opposite directions (one increases while the other decreases)
• Linear: If the ratio of change between the two variables
is a constant
• Non-linear: If the amount of change in one variable
does not bear a constant ratio to the amount of
change in the other
• Simple: only two variables are studied
• Multiple: more than two variables are studied
simultaneously
• Partial: two variables are studied while excluding the
effect of some other variable(s)
• Total: all variables and facts are taken into account
Measures of Correlation
• Scatter diagram method/Graphic method
• Algebraic method: Karl Pearson’s coefficient of
correlation
• Karl Pearson, a great biometrician and statistician,
suggested a mathematical method for measuring
the magnitude of the linear relationship between
two variables
For grouped data, this formula can be written in terms of the
transformed variables u and v, where u = (X − a)/h and
v = (Y − b)/k (a, b are assumed origins and h, k are class widths).
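Karl Pearson's formula is r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² · Σ(Y − Ȳ)²]. Below is a minimal Python sketch of this calculation; the paired height/weight values are hypothetical, chosen only for illustration.

```python
import numpy as np

def pearson_r(x, y):
    """Karl Pearson's coefficient of correlation between two variables."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()                 # deviations from the means
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Hypothetical paired observations: height (cm) and weight (kg) of 8 students
height = [150, 152, 155, 157, 160, 161, 164, 166]
weight = [48, 50, 49, 54, 56, 57, 60, 61]
print(round(pearson_r(height, weight), 3))              # close to +1: strong positive
```

Because r is independent of change of origin and scale, passing u = (X − a)/h and v = (Y − b)/k to the same function returns the same value of r.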
Assumptions for Correlation Coefficient
• Assumption of Linearity
• The variables used to compute the correlation coefficient must be
linearly related. Linearity can be checked with a scatter
diagram.
• Assumption of Normality
• Both variables under study should follow Normal distribution.
They should not be skewed in either the positive or the
negative direction.
• Assumption of Cause and Effect Relationship
• There should be a cause-and-effect relationship between the two
variables, for example, heights and weights of children,
demand and supply of goods, etc. When there is no cause-and-effect
relationship between the variables, the correlation
coefficient should be zero; if it is non-zero, the correlation is
termed chance correlation or spurious correlation. For
example, the correlation coefficient between:
1. Weight and income of a person over periods of time;
2. Rainfall and literacy in a state over periods of time
Properties of Correlation
• Correlation coefficient lies between –1 and +1
• r = +1 perfect positive correlation and r = -1 perfect
negative correlation between the variables.
• ‘r’ is independent of change of origin and scale.
• If X and Y are two independent variables then
correlation coefficient between X and Y is zero, i.e.
Corr(x, y)=0
• It is a pure number, independent of the units of
measurement.
• The correlation coefficient is the geometric mean of the two
regression coefficients (see the derivation below).
• The correlation coefficient is symmetric: rxy = ryx
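The geometric-mean property in the list above follows directly from the definitions of the two regression coefficients:

```latex
b_{yx} = r\,\frac{\sigma_y}{\sigma_x}, \qquad
b_{xy} = r\,\frac{\sigma_x}{\sigma_y}
\quad\Longrightarrow\quad
b_{yx}\, b_{xy} = r^{2}
\quad\Longrightarrow\quad
r = \pm\sqrt{b_{yx}\, b_{xy}}
```

with the sign of r taken as the common sign of byx and bxy.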
Example:
Exercise:
Spearman’s Rank Correlation:
• Proposed by Charles Edward Spearman in 1904
• No assumption about the parameters of the population is
made. This method is based on ranks.
• It is useful for studying qualitative attributes such as
honesty, colour, beauty, intelligence, character, morality, etc.
It is also denoted by ‘ρ’. The value of
ρ lies between −1 and +1. If ρ = +1,
there is complete agreement in the order
of ranks and the ranks move in the
same direction. If ρ = −1, there is
complete disagreement in the order of
ranks and they move in opposite
directions.
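A minimal Python sketch of Spearman's formula ρ = 1 − 6Σd²/[n(n² − 1)] for data without tied observations; the mark values are hypothetical.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman's rank correlation for data without tied observations."""
    x, y = np.asarray(x), np.asarray(y)
    rx = x.argsort().argsort() + 1          # rank of each x (1 = smallest value)
    ry = y.argsort().argsort() + 1          # rank of each y
    d = rx - ry                             # differences between paired ranks
    n = len(x)
    return 1 - 6 * (d ** 2).sum() / (n * (n ** 2 - 1))

# Hypothetical marks of 8 students in Statistics and Mathematics
stats_marks = [85, 60, 73, 40, 90, 55, 62, 78]
maths_marks = [78, 65, 70, 45, 88, 50, 60, 80]
print(round(spearman_rho(stats_marks, maths_marks), 3))  # positive association of ranks
```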
Example:
Thus there is a positive association between the ranks in Statistics and Mathematics.
If there are tied/repeated ranks, a correction factor is applied, as shown below.
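The correction is the standard one for tied ranks: each tied observation receives the average of the ranks it spans, and for every group of m tied values the quantity m(m² − 1)/12 is added to Σd².

```latex
\rho = 1 - \frac{6\left[\sum_i d_i^{2} + \sum_j \frac{m_j\,(m_j^{2}-1)}{12}\right]}{n\,(n^{2}-1)}
```

where m_j is the size of the j-th group of tied ranks.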
Standard error and Probable error of r
• Standard error of r: S.E.(r) = (1 − r²) / √n
• Probable error of r: P.E.(r) = 0.6745 × (1 − r²) / √n
• S.E. and P.E. are used as measures of the reliability of
the correlation coefficient (r)
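A small sketch of both measures with hypothetical r and n; a commonly quoted rule of thumb is that r is taken as significant when |r| > 6 × P.E.(r).

```python
import math

def se_r(r, n):
    """Standard error of the correlation coefficient r for a sample of size n."""
    return (1 - r ** 2) / math.sqrt(n)

def pe_r(r, n):
    """Probable error of r: 0.6745 times its standard error."""
    return 0.6745 * se_r(r, n)

r, n = 0.8, 25                      # hypothetical values
print(round(se_r(r, n), 4))         # (1 - 0.64) / 5 = 0.072
print(round(pe_r(r, n), 4))         # 0.6745 * 0.072 ≈ 0.0486
```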
Coefficient of concurrent deviations
• A very simple and casual method of finding correlation
when we are not serious about the magnitude of the two
variables is the application of concurrent deviations.
• This method attaches a positive sign to an x-value (except the
first) if it is greater than the previous value and a negative sign
if it is less than the previous value; the same is done for the
y-values, and the product of the two signs is recorded for each pair.
• The coefficient of concurrent deviations also lies between −1 and
+1 and is given by r_c = ±√(±(2c − m)/m).
If (2c − m) > 0, the positive sign is taken both inside and outside the
radical sign; if (2c − m) < 0, the negative sign is taken both
inside and outside the radical sign.
m = number of pairs of
deviations
c = no. of positive signs in the
product of deviation column
Example:
m = number of pairs of deviations, m = 7
c = No. of positive signs in the product of deviation column, c = 2
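A minimal sketch of the computation; the two series below are hypothetical but constructed so that m = 7 and c = 2, matching the example: 2c − m = −3 < 0, so the negative sign is used inside and outside the radical, giving r_c = −√(3/7) ≈ −0.65.

```python
import math

def concurrent_deviation(x, y):
    """Coefficient of concurrent deviations r_c = ±sqrt(±(2c − m) / m)."""
    # Sign of each deviation from the previous value: +1 if increase, -1 otherwise
    dx = [1 if b > a else -1 for a, b in zip(x[:-1], x[1:])]
    dy = [1 if b > a else -1 for a, b in zip(y[:-1], y[1:])]
    m = len(dx)                                            # number of pairs of deviations
    c = sum(1 for sx, sy in zip(dx, dy) if sx * sy > 0)    # concurrent (same-sign) pairs
    sign = 1 if (2 * c - m) >= 0 else -1                   # same sign inside and outside
    return sign * math.sqrt(sign * (2 * c - m) / m)

# Hypothetical series of 8 observations each, giving m = 7 and c = 2
x = [10, 12, 15, 18, 20, 25, 22, 19]
y = [80, 75, 70, 66, 60, 55, 50, 45]
print(round(concurrent_deviation(x, y), 3))                # -sqrt(3/7) ≈ -0.655
```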
Exercise
Simple linear regression
• Regression is a measure of the average
relationship between two or more variables in
terms of the original units of the data.
• The functional relationship between a dependent
variable Y and one or more independent variables X
is called a regression equation.
• The term regression was coined by Sir Francis Galton
• Types of Regression:
a) Simple and Multiple
b) Linear and Non –Linear
c) Total and Partial
• Simple: only two variables are considered.
• Multiple: more than two variables are involved; one variable is
the dependent variable and the remaining variables are
independent.
• Linear: the relationship follows a straight-line trend, whose
equation contains no power higher than one.
• Non-linear: a curved trend line is fitted; the equations are of
higher order, e.g. parabolic.
• Total: all the important variables are considered; these normally
take the form of multiple relationships, because most
economic and business phenomena are affected by a
multiplicity of causes.
• Partial: one or more variables are considered, but not all, thus
excluding the influence of those not found relevant for a given
purpose.
Linear Regression Equation:
• If two variables have a linear relationship, then as the
independent variable (X) changes, the dependent
variable (Y) also changes. The regression equation gives the
best estimate of one variable for a known value of the
other, and the equation is linear.
• The linear regression equation of Y on X is
Y = a + byx X + e
where a = intercept,
byx = slope of the line (the regression coefficient),
and e = random error, distributed as N(0, σ²)
Principle of ‘Least Squares’
The constants a and byx are estimated by the least squares
method: the sum of squared errors, Σ(Yi − a − byx Xi)², is
minimized by partially differentiating it w.r.t. a and byx
respectively and equating the derivatives to zero.
The resulting slope can be written as byx = r (σy / σx),
where r = coefficient of correlation, σy = SD of Y and σx = SD of X.
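A minimal sketch of the resulting least-squares estimates, byx = Σ(X − X̄)(Y − Ȳ)/Σ(X − X̄)² and a = Ȳ − byx X̄; the nitrogen/yield figures are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares estimates of a and b_yx in Y = a + b_yx * X + e."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    b_yx = (dx * dy).sum() / (dx ** 2).sum()   # slope: Σxy / Σx² (in deviations)
    a = y.mean() - b_yx * x.mean()             # intercept from Ȳ = a + b_yx·X̄
    return a, b_yx

# Hypothetical data: nitrogen applied (kg/ha) and crop yield (q/ha)
n_applied = [40, 40, 60, 60, 80, 80, 100, 100]
yield_q = [18, 20, 24, 25, 28, 30, 33, 34]
a, b = fit_line(n_applied, yield_q)
print(round(a, 3), round(b, 3))                # fitted line: yield = a + b * N
```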
Assumptions of Linear Regression
• The X’s are non-random, fixed constants. For
example, one may apply 40, 60, 80 and 100 kg of N
per ha to 10 plots each and then observe the
crop yield (Y) corresponding to each
selected level of X.
• At each fixed value of X, the corresponding values
of Y have a Normal distribution about the theoretical
mean.
• For any given X, the variance of Y is the same
(homoscedasticity).
• The Y’s observed at different values of X are
completely independent.
Properties of Regression Coefficient:
Example:
Exercise:
Test of significance of coefficient of correlation
• Null hypothesis H0: ρ = 0
• Alternative hypothesis H1: ρ ≠ 0
• If n is large, the sample correlation coefficient r can be tested
against a hypothesised population value ρ using
Z = (r − ρ) / [(1 − r²) / √(n − 1)]
• For testing H0: ρ = 0, the analysis of variance technique can be used:
F = r²(n − 2) / (1 − r²), with (1, n − 2) d.f.
(equivalently, t = r√(n − 2) / √(1 − r²) with (n − 2) d.f., and F = t²)
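A sketch of both statistics with hypothetical r, ρ and n.

```python
import math

def z_test_r(r, rho0, n):
    """Large-sample Z test of r against a hypothesised value rho0."""
    return (r - rho0) / ((1 - r ** 2) / math.sqrt(n - 1))

def f_test_r(r, n):
    """ANOVA-type F statistic for H0: rho = 0, with (1, n - 2) d.f."""
    return r ** 2 * (n - 2) / (1 - r ** 2)

r, n = 0.6, 27                         # hypothetical sample correlation and size
print(round(z_test_r(r, 0.0, n), 3))   # compare with the standard normal table
print(round(f_test_r(r, n), 3))        # compare with F(1, 25); equals t² with 25 d.f.
```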
Test of significance of coefficient of regression
• Test of significance of byx:
t = (byx − β) / S.E.(byx), with (n − 2) d.f.,
where β is the population regression coefficient and
S.E.(byx) = √[ Σ(yi − ŷi)² / ((n − 2) Σ(xi − x̄)²) ], ŷi being the estimated value of yi.
By solving, S.E.(byx) = √[ (Σyi² − byx² Σxi²) / ((n − 2) Σxi²) ],
where xi and yi denote deviations from their respective means.
• Que. Differentiate between correlation and regression.
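A sketch of the t-test for byx using the simplified standard-error formula above (sums of squares of deviations); the data are hypothetical.

```python
import math
import numpy as np

def t_test_byx(x, y, beta0=0.0):
    """t statistic for H0: beta = beta0, with (n - 2) degrees of freedom."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    dx, dy = x - x.mean(), y - y.mean()
    sxx, syy, sxy = (dx ** 2).sum(), (dy ** 2).sum(), (dx * dy).sum()
    b_yx = sxy / sxx
    se_b = math.sqrt((syy - b_yx ** 2 * sxx) / ((n - 2) * sxx))   # simplified S.E.(byx)
    return (b_yx - beta0) / se_b

x = [40, 40, 60, 60, 80, 80, 100, 100]     # hypothetical observations
y = [18, 20, 24, 25, 28, 30, 33, 34]
print(round(t_test_byx(x, y), 3))          # compare with the t table at (n - 2) = 6 d.f.
```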
• Test of significance of byx using ANOVA
• F= v1/v2 with d.f. (1, n-2)
S.V.       | d.f.  | S.S.               | M.S.
Regression | 1     | (Σxy)² / Σx²       | v1 = (Σxy)² / Σx²
Error      | n − 2 | Σy² − (Σxy)² / Σx² | v2 = [Σy² − (Σxy)² / Σx²] / (n − 2)
Total      | n − 1 | Σy²                |
(x and y denote deviations from their respective means)
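A sketch of the ANOVA layout above, again with x and y as deviations from their means; the data are hypothetical, and F equals the square of the t statistic computed for byx.

```python
import numpy as np

def regression_anova(x, y):
    """Regression ANOVA: returns (regression MS v1, error MS v2, F) as in the table."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    dx, dy = x - x.mean(), y - y.mean()              # deviations: Σx², Σy², Σxy below
    sxx, syy, sxy = (dx ** 2).sum(), (dy ** 2).sum(), (dx * dy).sum()
    ss_reg = sxy ** 2 / sxx                          # regression SS, 1 d.f.
    ss_err = syy - ss_reg                            # error SS, n - 2 d.f.
    v1, v2 = ss_reg / 1, ss_err / (n - 2)
    return v1, v2, v1 / v2                           # F with (1, n - 2) d.f.

x = [40, 40, 60, 60, 80, 80, 100, 100]               # hypothetical data
y = [18, 20, 24, 25, 28, 30, 33, 34]
v1, v2, F = regression_anova(x, y)
print(round(v1, 2), round(v2, 2), round(F, 2))       # compare F with F(1, n - 2)
```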
Thank You