M8.logreg.ppt

DV: categorical (dichotomous)
Exploring Relationships
 Is a variable related to the proportions of another?
 The first step is to examine the data using crosstabs
 Chi square test
 Logistic regression relies on an estimation procedure
 Models the probability of an outcome
 Transforms the probability of an event occurring into its
odds
 In logistic regression, the regression coefficient (b) can be interpreted as the change in the log odds associated with a one-unit increase in the associated predictor variable.
ln[Y/(1−Y)] = a + bX, where Y is the probability of the outcome
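To make the equation concrete, here is a minimal Python sketch of the log odds and the probability they imply. The coefficient values for a and b are hypothetical, chosen only for illustration, and the function names are not from the slides.

```python
import math

def log_odds(a, b, x):
    """Linear predictor: ln[Y / (1 - Y)] = a + b*x."""
    return a + b * x

def probability(a, b, x):
    """Invert the logit to recover Y, the probability of the outcome."""
    return 1.0 / (1.0 + math.exp(-log_odds(a, b, x)))

# Hypothetical coefficients (assumed for illustration only).
a, b = -2.0, 0.5

for x in (0, 1, 2):
    # Each one-unit increase in x adds b to the log odds.
    print(x, round(log_odds(a, b, x), 3), round(probability(a, b, x), 3))
```

Each step in x moves the log odds by exactly b, while the probability moves by a different amount at each step, which is why b is read on the log-odds scale.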

LOGIT
Multiple Regression vs. Logistic Regression
Multiple Regression:
 Used to make predictions about an unknown event from known evidence
 DV continuous
 IV can be any level of measurement
 Assumes linear relationship
 Uses least squares estimation (computed coefficients that minimize the residuals for all cases)
 Normally distributed variables
 Equal variance
Logistic Regression:
 Used to determine which variables affect the probability of a particular outcome
 DV categorical
 IV may be any level of measurement
 Doesn’t assume linear relationship but rather a logit transformation
 Uses maximum likelihood estimation (when the dependent variable is not normally distributed, ML estimates are preferred to OLS estimates because they are unbiased)
 Doesn’t assume normal distribution or equal variance
 Less stringent
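As a rough illustration of maximum likelihood estimation, the sketch below fits a one-predictor logistic model by gradient ascent on the log-likelihood. This is only one simple way to maximize the likelihood; statistical packages typically use other iterative algorithms. The data, learning rate, and step count are all hypothetical.

```python
import math

def sigmoid(z):
    """Map a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(a, b, xs, ys):
    """Log of the probability of the observed outcomes given a and b."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(a + b * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic(xs, ys, rate=0.05, steps=5000):
    """Gradient ascent on the log-likelihood (a simple stand-in for MLE software)."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(y - sigmoid(a + b * x) for x, y in zip(xs, ys))
        grad_b = sum((y - sigmoid(a + b * x)) * x for x, y in zip(xs, ys))
        a += rate * grad_a / n
        b += rate * grad_b / n
    return a, b

# Hypothetical data: x is a predictor, y is a dichotomous outcome (0/1).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

a_hat, b_hat = fit_logistic(xs, ys)
print("a =", round(a_hat, 3), "b =", round(b_hat, 3))
print("log-likelihood:", round(log_likelihood(a_hat, b_hat, xs, ys), 3))
```

Unlike least squares, there is no closed-form solution here, so the coefficients are reached by repeatedly adjusting them until the likelihood of the observed data stops improving.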

DV: categorical
The Odds Ratio
$$\frac{\text{Prob(event)}}{\text{Prob(not event)}} = e^{b_0 + b_1 x}$$
The left-hand side of the equation is the odds, and e is the base of the natural log.
What this equation tells us is that e raised to the power of, say, b1 is the factor by which the odds change when x1 increases by one unit, controlling for the other variables in the equation.
When the coefficient is positive, the odds increase; when the coefficient is negative, the odds decrease.
With several predictors, the odds are:
$$\frac{\text{Prob(event)}}{\text{Prob(not event)}} = e^{b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots}$$
Crude OR: simple logistic regression
Adjusted OR: multiple logistic regression
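A small Python sketch of a crude odds ratio computed from a 2x2 table, and of how it relates to a coefficient. The counts are hypothetical. For a simple logistic regression with a single binary predictor, the fitted coefficient is the natural log of this crude odds ratio, so exponentiating it recovers the OR.

```python
import math

def odds(p):
    """Odds = Prob(event) / Prob(not event)."""
    return p / (1.0 - p)

def odds_ratio_from_table(a, b, c, d):
    """Crude OR from a 2x2 table.

    Exposed row: a events, b non-events.
    Unexposed row: c events, d non-events.
    """
    return (a / b) / (c / d)

# Hypothetical counts (assumed for illustration only).
crude_or = odds_ratio_from_table(30, 70, 15, 85)

# The coefficient on the log odds scale, and back again.
b1 = math.log(crude_or)
print("crude OR:", round(crude_or, 3))
print("b1:", round(b1, 3), "exp(b1):", round(math.exp(b1), 3))
```

An adjusted OR is read the same way from a multiple logistic regression, except that the coefficient is estimated with the other predictors in the model.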

Examining likelihood of event
 Likelihood is conventionally expressed on a scale of 0 to 1
 Many health outcomes are dichotomous: Depressed=1 (yes) vs Depressed=0 (no)
 Can be used to compare likelihood in groups:
Cases vs controls
Males vs females
Chemo vs no chemo
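Comparing the likelihood of an outcome in two groups can start with a crosstab and a chi-square test. The sketch below uses hypothetical counts and the uncorrected Pearson chi-square for a 2x2 table (no continuity correction). The p-value formula shown is valid only for 1 degree of freedom.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square and p-value (df = 1) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For df = 1, chi2 is a squared standard normal, so the upper tail is erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical crosstab: rows are groups, columns are (depressed, not depressed).
table = ((40, 60),   # females
         (25, 75))   # males

chi2, p_value = chi_square_2x2(table)
print("proportion depressed, females:", 40 / 100)
print("proportion depressed, males:", 25 / 100)
print("chi-square:", round(chi2, 3), "p:", round(p_value, 4))
```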

What logistic regression predicts
 Predicts the probability of Y occurring given known values for X(s).
 In LR the DV is transformed into the natural log of the odds. This is called the logit (short for logistic probability unit).
 Probabilities, which range between 0.0 and 1.0, are transformed into odds, which range between 0 and infinity.
 If the probability for group membership in the modeled
category is above some cut point (the default is 0.50), the
subject is predicted to be a member of the modeled group.
If the probability is below the cut point, the subject is
predicted to be a member of the other group.
 For any given case, logistic regression computes the
probability that a case with a particular set of values for the
independent variable is a member of the modeled
category.
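The prediction rule above can be sketched as follows: compute the probability for a case from its predictor values, then compare it with the cut point. The coefficients and case values are hypothetical, and the function names are illustrative rather than taken from any statistics package.

```python
import math

def predict_probability(intercept, coefs, values):
    """Probability of membership in the modeled category for one case."""
    z = intercept + sum(b * x for b, x in zip(coefs, values))
    return 1.0 / (1.0 + math.exp(-z))

def classify(p, cut_point=0.50):
    """Return 1 (modeled group) if p is above the cut point, else 0."""
    return 1 if p > cut_point else 0

# Hypothetical model with two predictors (assumed for illustration only).
intercept = -1.0
coefs = [0.8, -0.3]

case = [2.0, 1.0]
p = predict_probability(intercept, coefs, case)
print("probability:", round(p, 3), "predicted group:", classify(p))
```

Raising the cut point makes the model more conservative about predicting membership in the modeled group; for example, `classify(0.7, cut_point=0.8)` returns 0 where the default returns 1.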

Logistic Regression
 Logistic regression estimates the probability of a certain event occurring by modeling the logarithm of the odds
 Uses maximum likelihood estimation (MLE) to estimate the coefficients; the logit transformation converts the probability of an event occurring into its odds, which makes this a nonlinear model
 The odds are the probability of occurrence of a particular event over the probability of nonoccurrence; the odds ratio compares the odds for two groups
 The odds ratio is useful in providing an estimate of the magnitude of the relationship between binary variables
 Allows one to examine the effects of the variables on the relationship: how y varies when x varies

Model fit
 The probability of the observed results, given the parameter estimates, is used to determine how well the estimated model fits the data
 Likelihood index: if the model fits perfectly, the −2LL will equal 0. The goodness-of-fit statistic (similar to the F test in multiple regression) takes into consideration the difference between the observed probability of an event and the predicted probability, and follows a chi-square distribution.
 Hosmer-Lemeshow test (based on chi-square) compares predictions to a “perfect model”. When not significant, the null hypothesis that the model fits is supported, i.e. a nonsignificant result indicates adequate fit
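A minimal sketch of the −2LL index in Python. The outcomes and predicted probabilities are hypothetical and do not come from an actual fitted model, so the chi-square difference printed here only illustrates the arithmetic. The intercept-only comparison model simply predicts the overall event rate for every case.

```python
import math

def neg2_log_likelihood(ys, ps):
    """-2 times the log-likelihood of the observed outcomes given predicted probabilities."""
    ll = sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(ys, ps))
    return -2.0 * ll

# Hypothetical observed outcomes and predicted probabilities.
ys = [0, 0, 1, 0, 1, 1]
model_ps = [0.1, 0.2, 0.6, 0.4, 0.8, 0.9]

# Intercept-only (null) model: every case gets the overall event rate.
base_rate = sum(ys) / len(ys)
null_ps = [base_rate] * len(ys)

# Near-perfect predictions drive -2LL toward 0.
near_perfect_ps = [1e-6 if y == 0 else 1 - 1e-6 for y in ys]

null_dev = neg2_log_likelihood(ys, null_ps)
model_dev = neg2_log_likelihood(ys, model_ps)
print("-2LL null:", round(null_dev, 3))
print("-2LL model:", round(model_dev, 3))
print("chi-square (difference):", round(null_dev - model_dev, 3))
print("-2LL near-perfect:", neg2_log_likelihood(ys, near_perfect_ps))
```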

 R² values quantify the proportion of the variance explained by the model
 In logistic regression the Nagelkerke statistic is used to estimate the pseudo R-squared, which is the magnitude of the relationship between the dependent variable and the set of independent variables in the model.
 The b-weights and constant associated with each IV are used in logistic regression to determine the probability of a subject doing one thing or the other
 Instead of a score as with continuous variables, a probability ranging from 0 to 1 is given. If the probability is greater than .50, the prediction is for the occurrence, and if less than .50, for nonoccurrence
 Signs of the b-weights show the direction of the relationship
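The Nagelkerke pseudo R-squared can be computed from the log-likelihoods of the intercept-only model and the fitted model. It rescales the Cox-Snell R-squared so that its maximum is 1. The sketch below uses hypothetical log-likelihood values and sample size.

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    """Cox-Snell R-squared from the null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Cox-Snell R-squared divided by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

# Hypothetical values (assumed for illustration only).
n = 100
ll_null = 100 * math.log(0.5)   # intercept-only model with a 50/50 outcome split
ll_model = -55.0                # log-likelihood of a fitted model

print("Cox-Snell R2:", round(cox_snell_r2(ll_null, ll_model, n), 3))
print("Nagelkerke R2:", round(nagelkerke_r2(ll_null, ll_model, n), 3))
```

A model no better than the intercept-only model gives 0, and a perfectly fitting model (log-likelihood of 0) gives 1.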
