Stat-1163: Statistics in Environmental Science
Section B,
Chapter: Correlation and Regression
Md. Menhazul Abedin
Lecturer
Statistics Discipline
Khulna University, Khulna-9208
Email: menhaz70@gmail.com
Correlation
• Variable
– Independent variable (x)
– Dependent variable (y)
• They are related either proportionally or
anti-proportionally
• Let us see an example
• Cut your coat (y) according to your cloth (x)
Correlation
[Figure: three scatter diagrams of Y against X. Negative correlation: anti-proportional, X and Y move in opposite directions. Positive correlation: proportional, X and Y move in the same direction. No correlation: indecision, no direction.]
Correlation
• Scatter plot: A scatterplot is a useful
summary of a set of bivariate data (two
variables), usually drawn before working out a
linear correlation coefficient or fitting a
regression line. It gives a good visual picture of
the relationship between the two variables,
and aids the interpretation of the correlation
coefficient or regression model.
Correlation
• How do we measure correlation?
– Correlation coefficient
• Correlation coefficient: The correlation coefficient is a
quantitative measure of the direction and strength of the
linear relationship between two numerically
measured variables.
Correlation
• Correlation coefficient can be defined as
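The formula on this slide did not survive extraction. Assuming the usual Pearson product-moment definition is intended (it matches the SS notation used later in the regression slides), it would read $r = SS_{xy} / \sqrt{SS_{xx}\, SS_{yy}}$, where $SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$, $SS_{xx} = \sum (x_i - \bar{x})^2$ and $SS_{yy} = \sum (y_i - \bar{y})^2$. A minimal sketch of that calculation follows; the function name `pearson_r` and the data are illustrative assumptions, not taken from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r = SS_xy / sqrt(SS_xx * SS_yy)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    if ss_xx == 0 or ss_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

Here $SS_{xy} = 6$, $SS_{xx} = 10$ and $SS_{yy} = 6$, so $r = 6/\sqrt{60} \approx 0.77$.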
Correlation
• Assumptions:
– Both variables are measured on an interval or
ratio scale
– The variables follow a normal distribution
– The relationship between the variables is linear
– The sample is of adequate size to assume
normality
Correlation
Interpretation: The value of r is always
between +1 and –1.
• Exactly –1. A perfect downhill (negative) linear
relationship
• –0.70. A strong downhill (negative) linear
relationship
• –0.50. A moderate downhill (negative)
relationship
• –0.30. A weak downhill (negative) linear
relationship
• 0. No linear relationship
• +0.30. A weak uphill (positive) linear relationship
• +0.50. A moderate uphill (positive) relationship
• +0.70. A strong uphill (positive) linear relationship
• Exactly +1. A perfect uphill (positive) linear
relationship
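The benchmark values above can be turned into a rough labelling rule. The sketch below treats 0.30, 0.50 and 0.70 as lower cut-offs of bands, which is an assumption: the slides only give benchmark points and say nothing about values in between or below 0.30. The function name `describe_r` is hypothetical.

```python
def describe_r(r):
    """Return a rough verbal label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no linear relationship"
    direction = "uphill (positive)" if r > 0 else "downhill (negative)"
    size = abs(r)
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.70:
        strength = "strong"
    elif size >= 0.50:
        strength = "moderate"
    else:
        # Assumption: anything below 0.50 (including values under 0.30,
        # which the slides do not label) is called "weak".
        strength = "weak"
    return f"{strength} {direction} linear relationship"

for value in (-1.0, -0.70, -0.50, 0.0, 0.30, 1.0):
    print(value, "->", describe_r(value))
```

Such labels are conventions rather than rules; what counts as "strong" depends on the field of study.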
Correlation
• Misinterpretations:
– Correlation does not demonstrate a causal relationship
between two variables
– r = 0 does not mean that X and Y are unrelated,
only that they are not linearly related.
– Two variables can have a strong association but a
small correlation coefficient r if the relationship is
not linear.
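A small illustration of the last two points, using made-up data in which Y is completely determined by X through $y = x^2$ and yet r is zero. The helper `pearson_r` is an illustrative name and assumes the usual Pearson definition.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r = SS_xy / sqrt(SS_xx * SS_yy)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical data: y depends exactly on x, but not linearly.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)  # 0.0: no linear relationship, although y is a function of x
```

The positive and negative halves of the parabola cancel in $SS_{xy}$, which is why a scatter plot should be drawn before r is interpreted.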
Correlation
• Properties:
– Coefficient of Correlation lies between -1 and +1:
The coefficient of correlation cannot take a value less than -1
or more than +1. Symbolically,
-1 <= r <= +1 or |r| <= 1
– Coefficients of Correlation are independent of Change of
Origin:
This property reveals that if we subtract any constant
from all the values of X and Y, it will not affect the
coefficient of correlation.
– Coefficients of Correlation possess the property of
symmetry:
The degree of relationship between two variables is
symmetric: r(X, Y) = r(Y, X).
– Coefficient of Correlation is independent of
Change of Scale:
This property reveals that if we divide or
multiply all the values of X and Y by positive constants, it will not
affect the coefficient of correlation (multiplying one variable by a
negative constant only flips the sign of r).
– Coefficient of correlation measures only linear
correlation between X and Y.
– If two variables X and Y are independent, the
coefficient of correlation between them will be
zero (the converse does not hold in general).
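The origin, scale and symmetry properties can be checked numerically. This is a sketch on made-up data, not a proof; `pearson_r` is an illustrative helper assuming the usual Pearson definition.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r = SS_xy / sqrt(SS_xx * SS_yy)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical data, made up for illustration only.
x = [2, 4, 6, 8, 10]
y = [1, 3, 2, 5, 4]

r_xy = pearson_r(x, y)
# Change of origin: subtract or add constants.
r_shifted = pearson_r([xi - 5 for xi in x], [yi + 100 for yi in y])
# Change of scale: multiply or divide by positive constants.
r_scaled = pearson_r([3 * xi for xi in x], [yi / 2 for yi in y])
# Symmetry: swap the roles of x and y.
r_yx = pearson_r(y, x)
# A negative multiplier on one variable flips the sign of r.
r_flipped = pearson_r([-xi for xi in x], y)

print(r_xy, r_shifted, r_scaled, r_yx, r_flipped)
```

For this data $SS_{xy} = 16$, $SS_{xx} = 40$ and $SS_{yy} = 10$, so r is 0.8 in the first four cases and -0.8 in the last.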
Example-1
Example-2
• Exercise:
https://www.mathsisfun.com/data/correlation.html
Draw a scatter plot
Find the correlation
coefficient
Example-3
• Correlation of Gestational Age and Birth Weight
Different correlations
• Spearman rank correlation
• Spurious correlations
• Intraclass correlation
• Tetrachoric correlation
• Point bi-serial correlation
• Bi-serial correlation
Further study
• Proof of properties
• Distribution of the correlation coefficient
• Test of correlation coefficient
– Zero correlation test
– Non-zero correlation test
– Paired correlation test
Regression
Learning Objectives
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Ordinary Least Squares
4. Compute Regression Coefficients
5. Predict Response Variable
What is a Model?
1. Representation of
Some Phenomenon
Non-Math/Stats Model
What is a Math/Stats Model?
1. Often Describe Relationship between Variables
2. Types
- Deterministic Models (no randomness)
- Probabilistic Models (with randomness)
Example-1
What will the sales scenario be on days with
temperatures of 10°, 13°, 20° and 25°C?
Answering that is a job for a fortune-teller,
and we are not fortune-tellers.
But we have statistics.
Example-1
1. We need a straight line
2. There are infinitely many
straight lines
3. The line that best represents
the phenomenon is the
model
4. We have to fit that model:
𝑦𝑖 = 𝛼 + 𝛽𝑥𝑖 + 𝜖𝑖
𝑦𝑖: Dependent variable
𝑥𝑖: Independent variable
𝜖𝑖: Random error
𝛼: Intercept
𝛽: Regression coefficient
Minimize error and find
out 𝛼 and 𝛽 for modeling
(model fitting)
Types of
Regression Models
• Simple: 1 dependent variable, 1 explanatory variable
• Multiple: 1 dependent variable, 2+ explanatory variables
• Multivariate: 2+ dependent variables, no restriction on explanatory variables
• Linear / Non-linear
Simple linear regression model
• 𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝜖𝑖
– 𝑦𝑖: Dependent variable (known)
– 𝑥𝑖: Independent variable (known)
– 𝜖𝑖: Random error
– 𝛽0: Intercept (unknown)
– 𝛽1: Regression coefficient (unknown)
• Minimize the error and find out 𝛽0 and 𝛽1 for
modeling (model fitting)
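To make the role of the random error concrete, the sketch below generates data from the simple linear regression model. With the error standard deviation set to zero it behaves as a deterministic model; with a positive value it becomes a probabilistic one. The parameter values, the normal error distribution and the function name `simulate_slr` are all assumptions chosen for illustration.

```python
import random

def simulate_slr(x_values, beta0, beta1, sigma, seed=0):
    """Draw y_i = beta0 + beta1 * x_i + e_i with e_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]

x = [10, 13, 20, 25]  # hypothetical temperatures

# Deterministic model: no randomness, every point lies on the line.
y_deterministic = simulate_slr(x, beta0=4.0, beta1=2.0, sigma=0.0)

# Probabilistic model: the same line plus random error.
y_probabilistic = simulate_slr(x, beta0=4.0, beta1=2.0, sigma=1.5)

print(y_deterministic)   # [24.0, 30.0, 44.0, 54.0]
print(y_probabilistic)   # scattered around the values above
```

In practice only x and y are observed; the fitting step works backwards from such scattered points to estimates of the intercept and slope.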
Simple linear regression line
• $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ (regression line / prediction equation)
– $\hat{y}_i$: Fitted values
– $x_i$: Independent variable (known)
– $\hat{\beta}_0$: Estimated intercept (general mean)
– $\hat{\beta}_1$: Estimated regression coefficient (rate of
change / slope)
• How do we find $\hat{\beta}_0$ and $\hat{\beta}_1$?
Variables name
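Coefficient Equations
• Prediction equation: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
• Sample slope: $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
• Sample Y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$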
How do we find $\hat{\beta}_0$ and $\hat{\beta}_1$?
• SLS/OLS (simple/ordinary least squares)
• WLS (weighted least squares)
• GLS (generalized least squares)
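For the ordinary least squares case, the coefficient equations (slope $SS_{xy}/SS_{xx}$, intercept $\bar{y} - \hat{\beta}_1 \bar{x}$) can be sketched in a few lines. The function name `fit_simple_ols` and the data are illustrative assumptions; WLS and GLS are not covered by this sketch.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1), the OLS intercept and slope for y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    if ss_xx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    b1 = ss_xy / ss_xx        # sample slope
    b0 = y_bar - b1 * x_bar   # sample Y-intercept
    return b0, b1

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_ols(x, y)
fitted = [b0 + b1 * xi for xi in x]  # prediction equation at each x_i
print(round(b0, 4), round(b1, 4))    # 2.2 0.6
```

Here $\bar{x} = 3$, $\bar{y} = 4$, $SS_{xy} = 6$ and $SS_{xx} = 10$, giving a slope of 0.6 and an intercept of $4 - 0.6 \times 3 = 2.2$.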
Least Squares
• 1. ‘Best fit’ means the differences between actual Y
values and predicted Y values are a minimum. But
positive differences offset negative ones, so we square
the errors:
$$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \varepsilon_i^2$$
• 2. LS minimizes the sum of the squared
differences (errors) (SSE)
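The reason for squaring can be seen by comparing two candidate lines on made-up data: both have residuals that sum to zero, so the plain sum of errors cannot tell them apart, while the SSE can. The helper names and the data are illustrative assumptions.

```python
def residuals(x, y, b0, b1):
    """Errors y_i - (b0 + b1 * x_i) for the line with intercept b0, slope b1."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def sse(x, y, b0, b1):
    """Sum of squared errors for the given line."""
    return sum(e ** 2 for e in residuals(x, y, b0, b1))

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

flat_line = (4.0, 0.0)  # horizontal line through the mean of y
ols_line = (2.2, 0.6)   # least-squares line for this data

for name, (b0, b1) in (("flat", flat_line), ("OLS", ols_line)):
    total = sum(residuals(x, y, b0, b1))
    print(name, "sum of errors:", round(total, 10),
          "SSE:", round(sse(x, y, b0, b1), 10))
```

Both sums of errors come out as zero (up to rounding), but the SSE is 6 for the flat line and 2.4 for the least-squares line.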
Assumptions
1. The regression model is linear in parameters.
2. The regression model is correctly specified.
3. X’s are fixed in repeated samples.
4. Errors are normally distributed with mean zero
and fixed variance, i.e. $\varepsilon_i \sim N(0, \sigma^2)$.
5. No perfect multicollinearity.
6. No autocorrelation of residuals
Derivation of Parameters
• Least Squares (L-S):
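Minimize squared error
$$\sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
$$0 = \frac{\partial \sum \varepsilon_i^2}{\partial \beta_0} = \frac{\partial \sum (y_i - \beta_0 - \beta_1 x_i)^2}{\partial \beta_0} = -2\,(n\bar{y} - n\beta_0 - n\beta_1 \bar{x})$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$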
Derivation of Parameters
• Least Squares (L-S):
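Minimize squared error
$$\sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
$$0 = \frac{\partial \sum \varepsilon_i^2}{\partial \beta_1} = -2 \sum x_i (y_i - \beta_0 - \beta_1 x_i) = -2 \sum x_i (y_i - \bar{y} + \beta_1 \bar{x} - \beta_1 x_i)$$
$$\beta_1 \sum x_i (x_i - \bar{x}) = \sum x_i (y_i - \bar{y})$$
$$\beta_1 \sum (x_i - \bar{x})(x_i - \bar{x}) = \sum (x_i - \bar{x})(y_i - \bar{y})$$
$$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}}$$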
Derivation of Parameters
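• Prediction equation: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
• Sample slope: $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
• Sample Y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
• $\hat{\beta}_0$ and $\hat{\beta}_1$ are called the OLSE (ordinary least squares
estimators)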
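As a numerical check on the derivation, the two partial derivatives of the squared error should both vanish at the OLS estimates and not at an arbitrary other line. This is a sketch on made-up data; `fit_simple_ols` and `sse_gradient` are illustrative names.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) with b1 = SS_xy / SS_xx and b0 = y_bar - b1 * x_bar."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    return y_bar - b1 * x_bar, b1

def sse_gradient(x, y, b0, b1):
    """Partial derivatives of sum((y_i - b0 - b1*x_i)^2) w.r.t. b0 and b1."""
    d_b0 = -2 * sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))
    d_b1 = -2 * sum(xi * (yi - b0 - b1 * xi) for xi, yi in zip(x, y))
    return d_b0, d_b1

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_ols(x, y)
grad_at_ols = sse_gradient(x, y, b0, b1)       # both components ~ 0
grad_elsewhere = sse_gradient(x, y, 0.0, 1.0)  # clearly non-zero
print(grad_at_ols, grad_elsewhere)
```

A zero gradient only identifies a stationary point; it is a minimum here because the squared error is a convex function of the two coefficients.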
Interpretation of Coefficients
• 1. Slope ($\hat{\beta}_1$)
– Estimated Y changes by $\hat{\beta}_1$ for each 1-unit increase
in X
• If $\hat{\beta}_1 = 2$, then Y is expected to increase by 2 for each 1-unit
increase in X
• 2. Y-intercept ($\hat{\beta}_0$)
– Average value of Y when X = 0
• If $\hat{\beta}_0 = 4$, then average Y is expected to be 4
when X is 0
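Using the slide's own illustrative values (slope 2, intercept 4), the interpretation can be read directly off the prediction equation. The function name `predict` is a hypothetical helper.

```python
def predict(x, b0, b1):
    """Fitted value from the prediction equation y_hat = b0 + b1 * x."""
    return b0 + b1 * x

b0, b1 = 4.0, 2.0  # intercept and slope values used on the slide

at_zero = predict(0.0, b0, b1)                        # the intercept
step = predict(3.0, b0, b1) - predict(2.0, b0, b1)    # effect of +1 unit in X
print(at_zero, step)  # 4.0 2.0
```

The intercept is only meaningful as "average Y at X = 0" when X = 0 lies within, or close to, the range of the observed data.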
Example -1
• Consider the data obtained from a chemical process where the yield of the
process is thought to be related to the reaction temperature (see the table
below).
Example -1
• The least square estimates of the regression coefficients can
be obtained for the data in the preceding table as follows:
Example -1
• Once the fitted regression line is known, the fitted value
$\hat{y}_i$ corresponding to any observed data point can be
calculated. For example, the fitted value corresponding to the
21st observation in the preceding table is:
Properties of regression coefficients
1. The correlation coefficient is the geometric mean of the two
regression coefficients. Symbolically, it can be expressed
as $r = (\beta_{xy}\,\beta_{yx})^{1/2}$, with r taking the common sign of the two coefficients.
2. The arithmetic mean of the two regression coefficients is equal
to or greater than the coefficient of correlation:
$\dfrac{\beta_{xy} + \beta_{yx}}{2} \ge r$
3. The value of the coefficient of correlation cannot exceed
unity. Therefore, if one of the regression coefficients is
greater than unity, the other must be less than unity.
4. The regression coefficients are independent of the change
of origin, but not of the scale.
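These properties can be checked on a small made-up data set. The sketch assumes the common convention that $\beta_{yx}$ is the slope from regressing y on x and $\beta_{xy}$ the slope from regressing x on y; the helper names are illustrative.

```python
import math

def slope(x, y):
    """Slope SS_xy / SS_xx from regressing y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    return ss_xy / ss_xx

# Hypothetical data, made up for illustration only.
x = [2, 4, 6, 8, 10]
y = [1, 3, 2, 5, 4]

b_yx = slope(x, y)  # regression of y on x: 16 / 40 = 0.4
b_xy = slope(y, x)  # regression of x on y: 16 / 10 = 1.6

r = math.sqrt(b_yx * b_xy)            # property 1 (both slopes positive here)
arithmetic_mean = (b_yx + b_xy) / 2   # property 2: should be >= r

# Property 4: shifting the origin leaves the slope alone, rescaling does not.
b_shifted = slope([xi + 7 for xi in x], [yi - 3 for yi in y])
b_rescaled = slope([10 * xi for xi in x], y)

print(b_yx, b_xy, r, arithmetic_mean, b_shifted, b_rescaled)
```

The data also illustrate property 3: one coefficient (1.6) exceeds unity, so the other (0.4) is below it, and their product is $r^2 = 0.64$.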
Definition
• The regression analysis is a technique of
studying the dependence of one variable
(called the dependent variable) on one or more
variables (called explanatory variables), with a
view to estimating or predicting the average
value of the dependent variable in terms of
the known or fixed values of the independent
variables.
Applications
• The regression technique is primarily used to
– Estimate the relationship that exists, on average,
between the dependent variable and the explanatory
variables.
– Determine the effect of each of the explanatory
variables on the dependent variable, controlling for the
effects of all other explanatory variables.
– Predict the value of the dependent variable for a
given value of the explanatory variable.