Scatter Diagrams
A scatter plot, also referred to as a
scatter diagram, is a graph used to
represent the relationship between
two variables.
Dependent and Independent
Variables
A dependent variable is the variable to be
predicted or explained in a regression
model. This variable is assumed to be
functionally related to the independent
variable.
An independent variable is the variable
related to the dependent variable in a
regression equation. The independent
variable is used in a regression model to
estimate the value of the dependent
variable.
Correlation
The correlation coefficient is a quantitative
measure of the strength of the linear
relationship between two variables. The
correlation ranges from +1 to -1. A
correlation of +1 or -1 indicates a perfect linear
relationship, whereas a correlation of 0
indicates no linear relationship.
SAMPLE CORRELATION COEFFICIENT
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}} $$
where:
r = Sample correlation coefficient
n = Sample size
x = Value of the independent variable
y = Value of the dependent variable
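The sample correlation coefficient can be computed directly from its definition. The sketch below is a minimal illustration, assuming paired data held in two Python lists; the function name `sample_correlation` and the data values are hypothetical and not taken from the slides.

```python
import math

def sample_correlation(x, y):
    """Return the sample correlation coefficient r for paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator terms: sums of squared deviations for x and for y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = sample_correlation(x, y)
print(f"r = {r:.4f}")
```

Data lying exactly on an upward-sloping line give r = +1, and data lying exactly on a downward-sloping line give r = -1.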
TEST STATISTIC FOR CORRELATION
$$ t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}, \qquad df = n - 2 $$
where:
t = Number of standard deviations r is from 0
r = Simple correlation coefficient
n = Sample size
Correlation Significance Test
$$ H_0: \rho = 0 \quad \text{(no correlation)} $$
$$ H_A: \rho \neq 0 $$
$$ \alpha = 0.05 $$
$$ t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{0.8398}{\sqrt{\dfrac{1 - 0.7052}{8}}} = 4.37 $$
Rejection region: t < -2.306 or t > 2.306 (α/2 = 0.025 in each tail; $t_{0.025} = 2.306$)
Since t = 4.37 > 2.306, reject H0; there is a significant
linear relationship.
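The test statistic in this example can be checked with a few lines of code. The sketch below assumes the values shown in the example (r = 0.8398, n = 10 so that n - 2 = 8, and a critical value of 2.306); the function name `correlation_t_statistic` is hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """Return (t, df) for testing H0: rho = 0 given a sample correlation r."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    df = n - 2
    t = r / math.sqrt((1 - r ** 2) / df)
    return t, df

# Values taken from the worked example.
t, df = correlation_t_statistic(0.8398, 10)
critical_value = 2.306  # t for alpha/2 = 0.025 with df = 8, as given in the example
reject_h0 = abs(t) > critical_value
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

The computed t agrees with the 4.37 in the example up to rounding, because the example rounds r² to 0.7052 before dividing.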
Simple Linear Regression
Analysis
Simple linear regression analysis
analyzes the linear relationship that
exists between a dependent variable
and a single independent variable.
SIMPLE LINEAR REGRESSION MODEL
(POPULATION MODEL)
$$ y = \beta_0 + \beta_1 x + \varepsilon $$
where:
y = Value of the dependent variable
x = Value of the independent variable
β0 = Population’s y-intercept
β1 = Slope of the population regression line
ε = Error term, or residual
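One way to get a feel for the population model is to simulate data from it. The sketch below is an illustration only: it assumes a normally distributed error term with mean 0 and a standard deviation `sigma`, and the function name, parameter names, and numeric values are all hypothetical.

```python
import random

def simulate_population_model(beta0, beta1, xs, sigma, seed=0):
    """Generate y values as intercept + slope * x + a random error term.

    The error term is drawn from a normal distribution with mean 0 and
    standard deviation sigma (an assumption made for this illustration).
    """
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical parameter values for illustration only.
xs = [10, 20, 30, 40, 50]
ys = simulate_population_model(beta0=5.0, beta1=0.8, xs=xs, sigma=2.0)
for x, y in zip(xs, ys):
    print(f"x = {x:2d}, y = {y:6.2f}")
```

With `sigma=0.0` every simulated point falls exactly on the population regression line; larger values scatter the points around it.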
The simple linear regression model has four
assumptions:
Individual values of the error terms, εi, are
statistically independent of one another.
The distribution of all possible values of ε is
normal.
The distributions of possible εi values have equal
variances for all values of x.
The means of the dependent variable, y, for all specified
values of the independent variable can be
connected by a straight line called the population
regression model.
The interpretation of the regression slope
coefficient is that it gives the average change
in the dependent variable for a unit change in
the independent variable. The slope
coefficient may be positive or negative,
depending on the relationship between the
two variables.
A residual is the difference between
the actual value of the dependent
variable and the value predicted by
the regression model.
$$ \text{residual} = y - \hat{y} $$
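Residuals are straightforward to compute once a regression line is available. The sketch below assumes a line with intercept `b0` and slope `b1` has already been estimated; the function name, the coefficients, and the data are hypothetical.

```python
def residuals(x, y, b0, b1):
    """Return actual minus predicted values for the line y_hat = b0 + b1 * x."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical data and coefficients for illustration only.
x = [1, 2, 3]
y = [3.0, 5.5, 6.5]
e = residuals(x, y, b0=1.0, b1=2.0)
print(e)
```

A positive residual means the actual value lies above the line, and a negative residual means it lies below.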
ESTIMATED REGRESSION MODEL
(SAMPLE MODEL)
$$ \hat{y} = b_0 + b_1 x $$
where:
ŷ = Estimated, or predicted, y value
b0 = Unbiased estimate of the regression intercept
b1 = Unbiased estimate of the regression slope
x = Value of the independent variable
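LEAST SQUARES EQUATIONS
$$ b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} $$
algebraic equivalent:
$$ b_1 = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}} $$
and
$$ b_0 = \bar{y} - b_1 \bar{x} $$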
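Both forms of the slope equation can be sketched in code and compared. The example below is a minimal illustration, not a definitive implementation; the function names `least_squares` and `slope_algebraic` and the data are hypothetical.

```python
def least_squares(x, y):
    """Return (b0, b1) using the deviation-from-the-mean form of the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("the slope is undefined when all x values are equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def slope_algebraic(x, y):
    """Return b1 using the algebraically equivalent sums-of-products form."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    return (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, algebraic b1 = {slope_algebraic(x, y):.3f}")
```

The two slope calculations should agree up to floating-point rounding; the algebraic form is convenient for hand calculation because it needs only running totals.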
Interpretation of Results:
Example
The slope of 0.766 means that for each increase of
one unit in X, we predict the average of Y to
increase by an estimated 0.766 units.
The equation estimates that for each increase of 1
point on the math achievement test, the expected
final calculus grade is predicted to increase by
0.766 points.
$$ \hat{y} = 40.78 + 0.766(x) $$
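The slope interpretation can be checked numerically using the coefficients from this example (intercept 40.78, slope 0.766). The function name `predict_grade` and the math scores used below are hypothetical.

```python
def predict_grade(math_score):
    """Predicted final calculus grade from the example's estimated equation."""
    return 40.78 + 0.766 * math_score

# Hypothetical math scores chosen for illustration only.
grade_at_50 = predict_grade(50)
grade_at_51 = predict_grade(51)
print(f"predicted grade at 50: {grade_at_50:.2f}")
print(f"change for a 1-point increase: {grade_at_51 - grade_at_50:.3f}")
```

The difference between the two predictions equals the slope, whichever pair of adjacent scores is chosen.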
[Figure: "Linear Regression" scatter plot of grade (vertical axis, 60.00 to 90.00) against mathscor (horizontal axis, 20.00 to 70.00), with the fitted line grade = 40.78 + 0.77 * mathscor.]
The coefficient of determination is the
portion of the total variation in the
dependent variable that is explained by its
relationship with the independent variable.
The coefficient of determination is also
called R-squared and is denoted as R².
COEFFICIENT OF DETERMINATION
SINGLE INDEPENDENT VARIABLE CASE
$$ R^2 = r^2 $$
where:
R² = Coefficient of determination
r = Simple correlation coefficient
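The identity between the coefficient of determination and the squared correlation can be verified numerically. The sketch below assumes the usual decomposition in which the explained portion equals 1 minus the ratio of the sum of squared residuals (SSE) to the total sum of squares (SST); those names, the function name, and the data are assumptions for illustration and do not appear in the slides.

```python
import math

def r_squared_and_r(x, y):
    """Return (R^2 from the fitted line, r from the correlation formula)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)  # total variation in y (SST)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Least squares line.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Unexplained variation: sum of squared residuals (SSE).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / syy
    r = sxy / math.sqrt(sxx * syy)
    return r_squared, r

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r_squared, r = r_squared_and_r(x, y)
print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
```

This equality holds for the single independent variable case only, which is why the formula is labeled that way.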
Coefficients of Determination (r²)
and Correlation (r)
[Figure: four scatter plots of Y against X, each with the fitted line Ŷi = b0 + b1Xi. Panels: r² = 1, r = +1; r² = 1, r = -1; r² = .81, r = +0.9; r² = 0, r = 0.]
Simple Regression Steps
Develop a scatter plot of y and x. You are
looking for a linear relationship between
the two variables.
Calculate the least squares regression line
for the sample data.
Calculate the correlation coefficient and the
simple coefficient of determination, R².
Conduct one of the significance tests.