
Chapter 11 - 250305 - 102157

The document contains practice questions related to linear regression, including calculations for regression lines, residuals, and confidence intervals based on various datasets. It covers topics such as estimating expected values, testing hypotheses, and analyzing the fit of models through ANOVA and residual plots. Additionally, it includes exercises on transforming models to linear form and interpreting the results of regression analyses.

Uploaded by

Dheer Shah
Copyright
© All Rights Reserved

CS1-11: Linear regression Page 27

Chapter 11 Practice Questions

11.1 A new computerised ultrasound scanning technique has enabled doctors to monitor the weights
of unborn babies. The table below shows the estimated weights for one particular baby at
fortnightly intervals during the pregnancy.

Gestation period (weeks)      30    32    34    36    38    40
Estimated baby weight (kg)    1.6   1.7   2.5   2.8   3.2   3.5

$\sum x = 210 \quad \sum x^2 = 7{,}420 \quad \sum y = 15.3 \quad \sum y^2 = 42.03 \quad \sum xy = 549.8$
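As a quick check on the summary statistics above, the following Python sketch recomputes the sums from the tabulated data and then derives the corrected sums of squares, the least squares line and the usual estimate of the error variance on n - 2 degrees of freedom. The variable names (`weeks`, `weights`, `s_xx` and so on) are illustrative and are not part of the original question.

```python
# Gestation period (weeks) and estimated baby weight (kg) from the table.
weeks = [30, 32, 34, 36, 38, 40]
weights = [1.6, 1.7, 2.5, 2.8, 3.2, 3.5]
n = len(weeks)

# Raw sums, as quoted beneath the table.
sum_x = sum(weeks)
sum_y = sum(weights)
sum_xx = sum(x * x for x in weeks)
sum_yy = sum(y * y for y in weights)
sum_xy = sum(x * y for x, y in zip(weeks, weights))

# Corrected sums of squares and products.
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least squares estimates of the slope and intercept.
beta_hat = s_xy / s_xx
alpha_hat = sum_y / n - beta_hat * sum_x / n

# Estimate of the error variance with n - 2 degrees of freedom.
sigma2_hat = (s_yy - s_xy ** 2 / s_xx) / (n - 2)

print(f"Sxx = {s_xx:.3f}, Syy = {s_yy:.3f}, Sxy = {s_xy:.3f}")
print(f"fitted line: y = {alpha_hat:.2f} + {beta_hat:.4f} x")
print(f"error variance estimate = {sigma2_hat:.4f}")
```

Working from the raw data rather than the quoted sums makes it easy to spot a transcription slip in either.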


(i) Show that:

(a) $S_{xx} = 70$, $S_{yy} = 3.015$ and $S_{xy} = 14.3$.

(b) the fitted regression line is $\hat{y} = -4.60 + 0.2043x$.

(c) $\hat{\sigma}^2 = 0.0234$.

(ii) Calculate the baby’s expected weight at 42 weeks (assuming it hasn’t been born by then).

(iii) (a) Calculate the residual sum of squares and the regression sum of squares for these
data.

(b) Calculate the coefficient of determination, $R^2$, and comment on its value.

(iv) Carry out a test of $H_0: \beta = 0$ vs $H_1: \beta$ … $0$, assuming a linear model is appropriate.

(v) Construct an ANOVA table for the sum of squares from part (iii)(a) and carry out an F-test
stating the conclusion clearly.

(vi) (a) Estimate the mean weight of a baby at 33 weeks. Calculate the variance of this
mean predicted response.

(b) Hence, calculate a 90% confidence interval for the mean weight of a baby at 33
weeks.

(vii) (a) Estimate the actual weight of an individual baby at 33 weeks. Calculate the
variance of this individual predicted response.

(b) Hence, calculate a 90% confidence interval for the weight of an individual baby at
33 weeks.

The Actuarial Education Company © IFE: 2019 Examinations


The table below shows some of the residuals:

Gestation period (weeks)    30     32     34     36     38     40
Residual                    0.07                 0.05   0.04   -0.07
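The residuals in a table like this are the observed values minus the fitted values. A minimal Python sketch, assuming the baby-weight data from the start of the question and using illustrative variable names, refits the line and lists all six residuals:

```python
# Data from the table at the start of question 11.1.
weeks = [30, 32, 34, 36, 38, 40]
weights = [1.6, 1.7, 2.5, 2.8, 3.2, 3.5]
n = len(weeks)

x_bar = sum(weeks) / n
y_bar = sum(weights) / n
s_xx = sum((x - x_bar) ** 2 for x in weeks)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weeks, weights))

# Least squares line.
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Residual = observed weight minus fitted weight.
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(weeks, weights)]

for x, e in zip(weeks, residuals):
    print(f"week {x}: residual {e:+.2f}")

# With an intercept in the model, least squares residuals sum to zero
# (up to floating-point error), which is a useful arithmetic check.
print(f"sum of residuals = {sum(residuals):.1e}")
```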

(viii) (a) Calculate the missing residuals.

(b) Draw a dotplot of the residuals and comment.

(c) Plot the residuals against the x values and comment on the fit.

(d) Comment on the Q-Q plot of the residuals given below:

11.2 An analysis using the simple linear regression model based on 19 data points gave:

$s_{xx} = 12.2 \quad s_{yy} = 10.6 \quad s_{xy} = 8.1$

(i) (a) Calculate $\hat{\beta}$.

(b) Test whether $\beta$ is significantly different from zero.

(ii) (a) Calculate $r$.

(b) Test whether $\rho$ is significantly different from zero.

(iii) Comment on the results of the tests in parts (i) and (ii).
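The connection between the two tests can be explored numerically. The sketch below is one way to compute both test statistics from the summary values, assuming the usual normal-error model so that each statistic is referred to a t distribution with n - 2 = 17 degrees of freedom. The names `t_beta` and `t_rho` are illustrative.

```python
import math

n = 19
s_xx, s_yy, s_xy = 12.2, 10.6, 8.1

# Slope estimate and sample correlation coefficient.
beta_hat = s_xy / s_xx
r = s_xy / math.sqrt(s_xx * s_yy)

# Error variance estimate on n - 2 degrees of freedom.
sigma2_hat = (s_yy - s_xy ** 2 / s_xx) / (n - 2)

# t statistic for H0: beta = 0.
t_beta = beta_hat / math.sqrt(sigma2_hat / s_xx)

# t statistic for H0: rho = 0.
t_rho = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"beta_hat = {beta_hat:.4f}, r = {r:.4f}")
print(f"t statistic (slope)       = {t_beta:.3f}")
print(f"t statistic (correlation) = {t_rho:.3f}")
```

Comparing the two printed statistics is a natural starting point for the comment asked for in part (iii).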

11.3 The sums of the squares of the errors in a regression analysis are found to be:

$SS_{REG} = \sum (\hat{y}_i - \bar{y})^2 = 6.4 \quad SS_{RES} = \sum (y_i - \hat{y}_i)^2 = 3.6 \quad SS_{TOT} = \sum (y_i - \bar{y})^2 = 10.0$

Calculate the coefficient of determination and explain what this represents.

11.4 Explain how to transform the following models to linear form:

(i) $y_i = a + b x_i^2 + e_i$

(ii) $y_i = a e^{b x_i}$
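One possible approach to both parts is sketched below as a short LaTeX block. The symbols $X_i$, $Y_i$, $\alpha$ and $\beta$ are labels introduced here for illustration, and the log transformation in part (ii) assumes $a > 0$ so that every $y_i$ is positive.

```latex
% (i) Substitute X_i = x_i^2; the model is then linear in X_i.
y_i = a + b X_i + e_i, \qquad X_i = x_i^2

% (ii) Take natural logs of both sides (assumes a > 0, hence y_i > 0).
\ln y_i = \ln a + b x_i
\quad\Longrightarrow\quad
Y_i = \alpha + \beta x_i,
\qquad Y_i = \ln y_i,\ \alpha = \ln a,\ \beta = b
```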

11.5 (Exam style) A university wishes to analyse the performance of its students on a particular degree course. It
records the scores obtained by a sample of 12 students at entry to the course, and the scores
obtained in their final examinations by the same students. The results are as follows:

Student A B C D E F G H I J K L
Entrance exam score x (%) 86 53 71 60 62 79 66 84 90 55 58 72
Finals paper score y (%) 75 60 74 68 70 75 78 90 85 60 62 70

$\sum x = 836 \quad \sum y = 867 \quad \sum x^2 = 60{,}016 \quad \sum y^2 = 63{,}603 \quad \sum (x - \bar{x})(y - \bar{y}) = 1{,}122$


(i) Calculate the fitted linear regression equation of y on x. [3]

(ii) Assuming the full normal model, calculate an estimate of the error variance $\sigma^2$ and
obtain a 90% confidence interval for $\sigma^2$. [3]

(iii) By considering the slope parameter, formally test whether the data are positively
correlated. [3]

(iv) Calculate a 95% confidence interval for the mean finals paper score corresponding to an
individual entrance score of 53. [3]

(v) Calculate the proportion of variation explained by the model. Hence, comment on the fit
of the model. [2]
[Total 14]
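For a question of this type, the interval for a mean response follows the standard form: point estimate plus or minus a t critical value times the standard error of the mean response. The Python sketch below illustrates this for an entrance score of 53, working from the raw scores. The critical value 2.228 (upper 2.5% point of the t distribution with 10 degrees of freedom) is taken from standard tables and hard-coded as an assumption, and all variable names are illustrative.

```python
import math

# Entrance exam scores (x) and finals paper scores (y) for students A to L.
entrance = [86, 53, 71, 60, 62, 79, 66, 84, 90, 55, 58, 72]
finals = [75, 60, 74, 68, 70, 75, 78, 90, 85, 60, 62, 70]
n = len(entrance)

x_bar = sum(entrance) / n
y_bar = sum(finals) / n
s_xx = sum((x - x_bar) ** 2 for x in entrance)
s_yy = sum((y - y_bar) ** 2 for y in finals)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(entrance, finals))

# Fitted line and error variance estimate (n - 2 degrees of freedom).
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar
sigma2_hat = (s_yy - s_xy ** 2 / s_xx) / (n - 2)

# Mean response at x0 and its standard error.
x0 = 53
mean_hat = alpha_hat + beta_hat * x0
se_mean = math.sqrt(sigma2_hat * (1 / n + (x0 - x_bar) ** 2 / s_xx))

# Assumed table value: upper 2.5% point of t with 10 degrees of freedom.
T_CRIT = 2.228

lower = mean_hat - T_CRIT * se_mean
upper = mean_hat + T_CRIT * se_mean
print(f"fitted mean at x = {x0}: {mean_hat:.2f}")
print(f"95% CI for the mean response: ({lower:.2f}, {upper:.2f})")
```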

11.6 (Exam style) The share price, in pence, of a certain company is monitored over an 8-year period. The results
are shown in the table below:

Time (years)    0     1     2     3     4     5     6     7     8
Price           100   131   183   247   330   454   601   819   1,095

$\sum (x_i - \bar{x})^2 = 60 \quad \sum (y_i - \bar{y})^2 = 925{,}262 \quad \sum (x_i - \bar{x})(y_i - \bar{y}) = 7{,}087$


An actuary fits the following simple linear regression model to the data:

$y_i = \alpha + \beta x_i + e_i \qquad i = 0, 1, \ldots, 8$

where $\{e_i\}$ are independent normal random variables with mean zero and variance $\sigma^2$.

(i) Determine the fitted regression line in which the price is modelled as the response and
the time as an explanatory variable. [2]

(ii) Calculate a 99% confidence interval for:

(a) $\beta$, the true underlying slope parameter

(b) $\sigma^2$, the true underlying error variance. [5]

(iii) (a) State the ‘total sum of squares’ and calculate its partition into the ‘regression sum
of squares’ and the ‘residual sum of squares’.

(b) Use the values in part (iii)(a) to calculate the ‘proportion of variability explained by
the model’ and comment on the result. [5]

(iv) The actuary decides to check the fit of the model by calculating the residuals.

(a) Complete the table of residuals (rounding to the nearest integer):

Time (years)    0      1      2      3      4      5      6      7      8
Residual        132           -21    -75           -104   -75    25

(b) Use a dotplot of the residuals to comment on the assumption of normality.

(c) Plot the residuals against time and hence comment on the appropriateness of the
linear model. [7]
[Total 19]
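The partition of the total sum of squares can be computed directly from the three summary statistics quoted in the question. The following Python sketch shows one way to do this; the variable names are illustrative.

```python
# Summary statistics quoted in question 11.6.
s_xx = 60.0
s_yy = 925_262.0  # this is the total sum of squares
s_xy = 7_087.0

ss_tot = s_yy
ss_reg = s_xy ** 2 / s_xx   # variation explained by the fitted line
ss_res = ss_tot - ss_reg    # variation left in the residuals

# Proportion of variability explained by the model.
r_squared = ss_reg / ss_tot

print(f"SS_TOT = {ss_tot:,.1f}")
print(f"SS_REG = {ss_reg:,.1f}")
print(f"SS_RES = {ss_res:,.1f}")
print(f"R^2    = {r_squared:.4f}")
```

A high proportion explained does not by itself show that a straight line is the right shape, which is why part (iv) goes on to examine the residuals.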

11.7 (Exam style) A schoolteacher is investigating the claim that class size does not affect GCSE results. His
observations of nine GCSE classes are as follows:

Class                                       X1    X2    X3    X4    Y1    Y2    Y3    Y4    Y5
Students in class ($c$)                     35    32    27    21    34    30    28    24    7
Average GCSE point score for class ($p$)    5.9   4.1   2.4   1.7   6.3   5.3   3.5   2.6   1.6

$\sum c = 238 \quad \sum c^2 = 6{,}884 \quad \sum p = 33.4 \quad \sum p^2 = 149.62 \quad \sum cp = 983$


(i) Determine the fitted regression line for $p$ on $c$. [3]

(ii) Class X5 was not included in the results above and contains 15 students. Calculate an
estimate of the average GCSE point score for this individual class and specify the standard
error for this estimate assuming the full normal model. [4]
[Total 7]
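When predicting a single new observation rather than a mean, the variance picks up an extra term for the new observation's own error. The Python sketch below, using the summary statistics from the question and illustrative variable names, computes both standard errors at a class size of 15 so the difference is visible.

```python
import math

# Summary statistics quoted in question 11.7.
n = 9
sum_c, sum_c2 = 238, 6_884
sum_p, sum_p2 = 33.4, 149.62
sum_cp = 983

s_cc = sum_c2 - sum_c ** 2 / n
s_pp = sum_p2 - sum_p ** 2 / n
s_cp = sum_cp - sum_c * sum_p / n

# Fitted line for p on c and error variance estimate.
beta_hat = s_cp / s_cc
alpha_hat = sum_p / n - beta_hat * sum_c / n
sigma2_hat = (s_pp - s_cp ** 2 / s_cc) / (n - 2)

# Prediction at a class size of 15.
c0 = 15
p_hat = alpha_hat + beta_hat * c0
c_bar = sum_c / n
leverage = 1 / n + (c0 - c_bar) ** 2 / s_cc

# Standard error for the mean response, and for one individual class
# (the individual case adds 1 for the new observation's own error).
se_mean = math.sqrt(sigma2_hat * leverage)
se_individual = math.sqrt(sigma2_hat * (1 + leverage))

print(f"predicted point score at c = {c0}: {p_hat:.3f}")
print(f"standard error (mean response):     {se_mean:.3f}")
print(f"standard error (individual class):  {se_individual:.3f}")
```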

11.8 (Exam style) An actuary is fitting the following linear regression model through the origin:

$Y_i = \beta x_i + e_i \qquad e_i \sim N(0, \sigma^2) \qquad i = 1, 2, \ldots, n$

(i) Show that the least squares estimator of $\beta$ is given by:

$\hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2}$ [3]

(ii) Derive the bias and mean square error of $\hat{\beta}$ under this model. [4]
[Total 7]
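An algebraic answer to part (ii) can be sanity-checked by simulation. Under this model the estimator has mean $\beta$ and variance $\sigma^2 / \sum x_i^2$, so its bias is zero and its mean square error equals that variance. The Python sketch below compares these values with empirical ones. The true slope, error standard deviation, design points, seed and number of simulations are all hypothetical choices made purely for illustration.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is reproducible

beta_true = 2.0            # hypothetical true slope
sigma = 1.0                # hypothetical error standard deviation
xs = list(range(1, 11))    # hypothetical design points x_1, ..., x_10
sum_x2 = sum(x * x for x in xs)


def simulate_beta_hat():
    """Draw one sample from Y_i = beta * x_i + e_i and return sum(x*Y) / sum(x^2)."""
    ys = [beta_true * x + random.gauss(0.0, sigma) for x in xs]
    return sum(x * y for x, y in zip(xs, ys)) / sum_x2


estimates = [simulate_beta_hat() for _ in range(20_000)]

empirical_bias = statistics.fmean(estimates) - beta_true
empirical_mse = statistics.fmean((b - beta_true) ** 2 for b in estimates)
theoretical_mse = sigma ** 2 / sum_x2

print(f"empirical bias  = {empirical_bias:+.5f}")
print(f"empirical MSE   = {empirical_mse:.6f}")
print(f"theoretical MSE = {theoretical_mse:.6f}")
```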

11.9 (Exam style) A life assurance company is examining the force of mortality, $\mu_x$, of a particular group of
policyholders. It is thought that it is related to the age, $x$, of the policyholders by the formula:

$\mu_x = Bc^x$

It is decided to analyse this assumption by using the linear regression model:

$Y_i = \alpha + \beta x_i + \varepsilon_i$ where $\varepsilon_i \sim N(0, \sigma^2)$ are independently distributed

The summary results for eight ages were as follows:

Age, $x$                                         30      32      34      36      38      40      42      44
Force of mortality, $\mu_x$ ($\times 10^{-4}$)   5.84    6.10    6.48    7.05    7.87    9.03    10.56   12.66
$\ln \mu_x$ (3 s.f.)                             -7.45   -7.40   -7.34   -7.26   -7.15   -7.01   -6.85   -6.67

$\sum x_i = 296 \quad \sum x_i^2 = 11{,}120 \quad \sum \ln \mu_{x_i} = -57.129 \quad \sum (\ln \mu_{x_i})^2 = 408.50 \quad \sum x_i \ln \mu_{x_i} = -2{,}104.5$

(i) (a) Apply a transformation to the original formula, $\mu_x = Bc^x$, to make it suitable for
analysis by linear regression. Hence, write down expressions for $Y$, $\alpha$ and $\beta$ in
terms of $\mu_x$, $B$ and $c$.

(b) Plot a graph of $\ln \mu_x$ against the age of the policyholder, $x$. Hence comment on
the suitability of the regression model and state how this supports the
transformation in part (a). [4]

(ii) Use the data to calculate least squares estimates of B and c in the original formula. [3]

(iii) (a) Calculate the coefficient of determination between $\ln \mu_x$ and $x$. Hence comment
on the fit of the model to the data.

(b) Complete the table of residuals and use it to comment on the fit. [5]

Age, x 30 32 34 36 38 40 42 44

Residual, $\hat{e}_i$ 0.08 0.03 0.06 0.02 0.09

(iv) Calculate a 95% confidence interval for the mean predicted response $\ln \mu_{35}$ and hence
obtain a 95% confidence interval for the mean predicted value of $\mu_{35}$. [4]
[Total 16]
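Once a line has been fitted on the log scale, estimates of the original parameters come from exponentiating the fitted intercept and slope. The Python sketch below shows this back-transformation, assuming the model $\ln \mu_x = \ln B + x \ln c$. The sums involving $\ln \mu_x$ are entered as negative values (the tabulated forces of mortality are of order $10^{-4}$, so their logs are negative), and the variable names are illustrative.

```python
import math

# Summary statistics from question 11.9 (y denotes ln of the force of mortality).
n = 8
sum_x, sum_x2 = 296, 11_120
sum_y = -57.129      # sum of ln(mu_x); negative because each mu_x is well below 1
sum_xy = -2_104.5    # sum of x * ln(mu_x)

s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least squares fit on the log scale.
beta_hat = s_xy / s_xx
alpha_hat = sum_y / n - beta_hat * sum_x / n

# Back-transform: alpha = ln(B) and beta = ln(c).
B_hat = math.exp(alpha_hat)
c_hat = math.exp(beta_hat)

print(f"alpha_hat = {alpha_hat:.4f}, beta_hat = {beta_hat:.5f}")
print(f"B estimate = {B_hat:.4e}, c estimate = {c_hat:.4f}")
```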

11.10 (Exam style) The government of a country suffering from hyperinflation has sponsored an economist to monitor
the price of a ‘basket’ of items in the population’s staple diet over a one-year period. As part of his
study, the economist selected six days during the year and on each of these days visited a single
nightclub, where he recorded the price of a pint of lager. His report showed the following prices:

Day ($i$)        8        29       57       92       141      148
Price ($P_i$)    15       17       22       51       88       95
$\ln P_i$        2.7081   2.8332   3.0910   3.9318   4.4773   4.5539

$\sum i = 475 \quad \sum i^2 = 54{,}403 \quad \sum \ln P_i = 21.5953 \quad \sum (\ln P_i)^2 = 81.1584 \quad \sum i \ln P_i = 1{,}947.020$


The economist believes that the price of a pint of lager in a given bar on day i can be modelled by:

$\ln P_i = a + bi + e_i$

where $a$ and $b$ are constants and the $e_i$’s are uncorrelated $N(0, \sigma^2)$ random variables.

(i) Estimate $a$, $b$ and $\sigma^2$. [5]

(ii) Calculate the linear correlation coefficient $r$. [1]

(iii) Calculate a 99% confidence interval for $b$. [2]

(iv) Determine a 95% confidence interval for the average price of a pint of lager on day 365:

(a) in the country as a whole

(b) in a randomly selected bar. [7]


[Total 15]
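Because the model is fitted to $\ln P_i$, an interval for a price is usually built on the log scale and then exponentiated. The Python sketch below illustrates this for day 365 under one reading of part (iv): the country-wide average is treated as a mean response and the randomly selected bar as an individual response. That reading, the hard-coded critical value 2.776 (upper 2.5% point of the t distribution with 4 degrees of freedom, from standard tables) and the helper name `back_transformed_interval` are all assumptions made for illustration.

```python
import math

# Summary statistics from question 11.10 (y denotes ln P_i).
n = 6
sum_i, sum_i2 = 475, 54_403
sum_y, sum_y2 = 21.5953, 81.1584
sum_iy = 1_947.020

s_ii = sum_i2 - sum_i ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_iy = sum_iy - sum_i * sum_y / n

# Fitted line on the log scale and error variance estimate.
b_hat = s_iy / s_ii
a_hat = sum_y / n - b_hat * sum_i / n
sigma2_hat = (s_yy - s_iy ** 2 / s_ii) / (n - 2)

# Point prediction of ln(price) on day 365.
day = 365
log_price_hat = a_hat + b_hat * day
leverage = 1 / n + (day - sum_i / n) ** 2 / s_ii

# Assumed table value: upper 2.5% point of t with 4 degrees of freedom.
T_CRIT = 2.776


def back_transformed_interval(variance):
    """Build the interval on the log scale, then exponentiate both endpoints."""
    half_width = T_CRIT * math.sqrt(variance)
    return (math.exp(log_price_hat - half_width),
            math.exp(log_price_hat + half_width))


mean_interval = back_transformed_interval(sigma2_hat * leverage)
individual_interval = back_transformed_interval(sigma2_hat * (1 + leverage))

print(f"a = {a_hat:.4f}, b = {b_hat:.5f}, sigma^2 = {sigma2_hat:.5f}")
print(f"mean response interval:       {mean_interval[0]:.0f} to {mean_interval[1]:.0f}")
print(f"individual response interval: {individual_interval[0]:.0f} to {individual_interval[1]:.0f}")
```

Day 365 lies far outside the observed range of days, so the leverage term is large and both intervals are wide.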

11.11 (Exam style) (i) Show that the maximum likelihood estimates (MLEs) of $\alpha$ and $\beta$ in the simple linear
regression model are identical to their least squares estimates. [5]

(ii) Show that the MLE of $\sigma^2$ has a different denominator to the least squares estimate. [4]
[Total 9]
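An outline of the argument, under the full normal model $y_i \sim N(\alpha + \beta x_i, \sigma^2)$ with independent observations, is sketched below in LaTeX. It is intended as a guide to the structure of a solution rather than a complete answer.

```latex
% Log-likelihood for independent y_i ~ N(\alpha + \beta x_i, \sigma^2), i = 1, ..., n
\ell(\alpha, \beta, \sigma^2)
  = -\frac{n}{2}\ln(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \alpha - \beta x_i)^2

% For any fixed \sigma^2, maximising \ell over (\alpha, \beta) is the same as
% minimising \sum (y_i - \alpha - \beta x_i)^2, which is the least squares criterion.

% Setting \partial\ell / \partial\sigma^2 = 0 at (\hat\alpha, \hat\beta):
-\frac{n}{2\sigma^2}
  + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - \hat\alpha - \hat\beta x_i)^2 = 0
\quad\Longrightarrow\quad
\hat\sigma^2_{\mathrm{MLE}}
  = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat\alpha - \hat\beta x_i)^2
```

The usual estimator of $\sigma^2$ divides the same residual sum of squares by $n - 2$ rather than $n$.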
