Quantitative Skills For Animal Sciences-Day 2
YAS33803
3. Data quality
Are there missing observations?
How were missing values treated?
Are there any “outliers” – strange data points?
Exercise
Output
Exploratory analyses - longevity
Exploratory analyses - histogram
(Strong) deviations from normality
Exploratory analyses - outliers
Outlier: a value that is far from the others; it is unusually
large or unusually small compared to the others.
1. Is there an explanation, and can we fix it? Was the value entered into the
computer correctly? If there was an error in data entry, fix it.
2. Is there a justification for excluding the value from the
analysis? Were there any experimental problems with that
value?
3. Is the outlier caused by “normal” variation? The
observation/individual may be different from the others. This
may be the most exciting finding in your data!
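Before working through the three questions above, it helps to shortlist candidate outliers. The sketch below uses Tukey's 1.5 × IQR fences, which is an assumed rule of thumb and not one prescribed in these slides; the measurements are made up for illustration. Flagged values are candidates for inspection, not for automatic deletion.

```python
import statistics

def flag_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences, an assumed rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements, not taken from the course data.
measurements = [52, 55, 58, 60, 61, 63, 64, 66, 120]
print(flag_outliers(measurements))
```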
Exploratory analyses - histogram
Rounding BW to nearest 10 g
Rounding BW to nearest 5 g
Missing BW
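Rounding and missing values like those annotated on the histogram can also be checked numerically. A minimal sketch, assuming body weights are recorded as whole grams and missing values are stored as `None`; the weights are hypothetical. A last digit that occurs far more often than one time in ten suggests rounding.

```python
from collections import Counter

def last_digit_counts(weights_g):
    """Count the terminal digit (0-9) of each non-missing weight in whole grams."""
    return Counter(w % 10 for w in weights_g if w is not None)

# Hypothetical body weights (g); None marks a missing record.
body_weights = [1850, 1900, 1925, 1930, 1947, 1950, 1960, 1975, 2000, 2010, None]

counts = last_digit_counts(body_weights)
n_missing = sum(w is None for w in body_weights)
n_present = len(body_weights) - n_missing

print("missing:", n_missing)
for digit in sorted(counts):
    # Share of non-missing weights ending in this digit.
    print(digit, counts[digit], round(counts[digit] / n_present, 2))
```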
Model
$y_{ij} = \mu + \text{Class}_i + \dots + \beta X_{ij} + e_{ij}$
Exercise
Exploratory analyses
Example - Egg shell strength in Poultry
Digression - Egg shell strength in Poultry
Results including 2 individuals with WEIGHT=0, n=539
Excluding 2 data errors has a big impact!
Exploratory analyses
Quantitative explanatory variables – regressors / covariables
How are they distributed? They do not need to be normally
distributed, but a clear bimodal pattern or a strange value
(outlier) might affect the results.
Exploratory analyses
Model
$y_{ij} = \mu + \text{Class}_i + \dots + \beta X_{ij} + e_{ij}$
Right-click the worksheet and tick the box “Create a copy”.
Paste the content in a new column.
Exploratory analyses
=COUNTIF(B$2:B$126,I2)
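The same per-class count can be sketched outside Excel. This is an illustrative Python equivalent of the `COUNTIF` formula, assuming the class labels sit in one list; the labels below are hypothetical and much shorter than the 125-row worksheet.

```python
from collections import Counter

# Hypothetical "Group" labels standing in for worksheet column B.
group_column = ["0-0", "1-0", "8-1", "0-0", "1-1", "8-0", "8-1", "0-0"]

# Counter tallies how often each class occurs, like one COUNTIF per class.
counts = Counter(group_column)

for group in sorted(counts):
    print(group, counts[group])
```

Classes with very few observations show up immediately in such a table.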
Exploratory analyses
Model
$y_{ij} = \mu + \text{Class}_i + \dots + \beta X_{ij} + e_{ij}$
Data analysis
a) Exploratory analyses
a.1) Initial examination of the data.
a.2) Relations between explanatory variables and the response variable.
a.3) Relations among explanatory variables.
a.4) Conclusions based on the exploratory analysis.
Exploratory analyses: explanatory-response
Exercise
Exploratory analyses: explanatory-response
=AVERAGEIF(B$2:B$126,I2,C$2:C$126)
Classes  n   Average  SD
0-0      25  63.56
1-0      25  64.8
1-1      25  56.76
8-0      25  63.36
8-1      25  38.72
Standard deviation? Use the “IF” function:
=STDEV.S(IF($B$2:$B$126=I2,$C$2:$C$126,FALSE))
Range $B$2:$B$126: column “Group”
Range $C$2:$C$126: column “Long”
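For comparison, the per-group average and sample standard deviation can be sketched in Python. This is a minimal illustration that assumes two parallel lists standing in for the “Group” and “Long” columns; the six values are hypothetical. `statistics.stdev` uses the n - 1 denominator, as `STDEV.S` does.

```python
import statistics
from collections import defaultdict

def summarise_by_group(groups, values):
    """Return {group: (n, mean, sample SD)} for parallel lists of labels and values."""
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    return {
        g: (len(vs), statistics.mean(vs), statistics.stdev(vs))
        for g, vs in by_group.items()
    }

# Hypothetical data, not the course worksheet.
groups = ["0-0", "0-0", "0-0", "8-1", "8-1", "8-1"]
longevity = [60, 65, 70, 35, 40, 45]

summary = summarise_by_group(groups, longevity)
for g, (n, mean, sd) in sorted(summary.items()):
    print(g, n, mean, round(sd, 2))
```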
Longevity (days)
Group  n    Average  SD
0-0    25   63.56    16.45
1-0    25   64.80    15.65
1-1    25   56.76    14.93
8-0    25   63.36    14.54
8-1    25   38.72    12.10
All    125  57.44    17.56
Are differences between Groups (treatments) large?
Do you expect “Group” to have a significant effect on Longevity?
Difference between means:
[Formula not recoverable from the slide.]
Alternative: Data Analysis – Anova: Single Factor (reorganize the data first).
Exercise
Output
Anova: Single Factor
SUMMARY
Groups Count Sum Average Variance
0-0 25 1589 63.56 270.67
1-0 25 1620 64.8 245
1-1 25 1419 56.76 222.85
8-0 25 1584 63.36 211.40
8-1 25 968 38.72 146.46
ANOVA
Source of Variation  SS       df   MS        F         P-value   F crit
Between Groups       11939.2  4    2984.82   13.61195  3.52E-09  2.447237
Within Groups        26313.5  120  219.2793
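The ANOVA table can be rebuilt from the SUMMARY block alone, which shows where the sums of squares come from. The sketch below uses the counts, averages and variances printed above; the function name is illustrative. It stops at F, because the P-value needs an F distribution that the Python standard library does not provide. Small differences from the Excel output come from the rounded summary values.

```python
def one_way_anova_from_summary(counts, means, variances):
    """One-way ANOVA quantities from per-group n, mean and sample variance."""
    n_total = sum(counts)
    grand_mean = sum(n * m for n, m in zip(counts, means)) / n_total
    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(counts, means))
    # Within groups: pooled spread of observations around their own group mean.
    ss_within = sum((n - 1) * v for n, v in zip(counts, variances))
    df_between = len(counts) - 1
    df_within = n_total - len(counts)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Values copied from the SUMMARY block (groups 0-0, 1-0, 1-1, 8-0, 8-1).
counts = [25, 25, 25, 25, 25]
means = [63.56, 64.8, 56.76, 63.36, 38.72]
variances = [270.67, 245, 222.85, 211.40, 146.46]

result = one_way_anova_from_summary(counts, means, variances)
for name, value in result.items():
    print(name, round(value, 4))
```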
Exploratory analyses: explanatory-response
Create scatterplot: 1. Select X & Y
Add trendline to plot: right mouse click on the “series”
Exercise
[Figure: scatterplot of Longevity (days) against Thor, shown beside the worksheet columns “Long” and “Thor”.]
[Figure: scatterplot of Longevity (days) against Thor with a fitted trendline.]
(Linear) relationship: y = 144.33x - 61.052, R² = 0.4051
• Note: linear model… but we can test if non-linear relationships give a better fit:
Quadratic relationship:
$\text{Long} = \beta_0 + \beta_1\,\text{Thor} + \beta_2\,\text{Thor}^2$
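Whether a quadratic term improves the fit can be explored by fitting both models and comparing R². The sketch below is a plain least-squares fit via the normal equations, written without external libraries; the thorax and longevity values are hypothetical, and the function names are illustrative. R² never decreases when a term is added, so a small gain alone is not evidence for the quadratic model.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [b0, b1, ..., b_degree] from the normal equations."""
    k = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def r_squared(xs, ys, coefs):
    """Proportion of the variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    fitted = [sum(b * x ** p for p, b in enumerate(coefs)) for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical thorax lengths (mm) and longevities (days), for illustration only.
thor = [0.60, 0.70, 0.80, 0.90, 1.00]
long_days = [30, 48, 58, 62, 60]

linear = fit_polynomial(thor, long_days, 1)
quadratic = fit_polynomial(thor, long_days, 2)
r2_linear = r_squared(thor, long_days, linear)
r2_quadratic = r_squared(thor, long_days, quadratic)
print("linear R2:", round(r2_linear, 4))
print("quadratic R2:", round(r2_quadratic, 4))
```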
[Figure: scatterplot of Longevity (days) against %Sleep.]
[Figure: the same scatterplot with a fitted trendline: y = 0.0046x + 57.332, R² = 2E-05.]
Exploratory analyses: among explanatory
Confounding
Two variables are confounded if they vary together in
such a way that it is impossible to determine which
variable is responsible for an observed effect.
Experiment comparing two treatments for depression
         Treatment
Age      1    2
Young    #    -
Old      -    #
(# = patients in this cell, - = none: age and treatment cannot be separated)
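A cross-tabulation of two explanatory variables makes this kind of confounding visible before any model is fitted. A minimal sketch with hypothetical patient records; the helper names are illustrative. Complete confounding here means that each level of one variable occurs with exactly one level of the other.

```python
from collections import Counter

def cross_tabulate(rows, cols):
    """Count observations for every (row level, column level) combination."""
    return Counter(zip(rows, cols))

def fully_confounded(rows, cols):
    """True if the levels of the two variables pair up one-to-one."""
    pairs = set(zip(rows, cols))
    n_row_levels, n_col_levels = len(set(rows)), len(set(cols))
    return n_row_levels > 1 and len(pairs) == n_row_levels == n_col_levels

# Hypothetical records: every young patient got treatment 1, every old patient treatment 2.
age = ["young", "young", "young", "old", "old", "old"]
treatment = [1, 1, 1, 2, 2, 2]

table = cross_tabulate(age, treatment)
print(dict(table))
print(fully_confounded(age, treatment))
```

Empty cells in the table are the warning sign; with all four cells filled, the two effects can be estimated separately.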
[Diagram: grid of Herd (rows 1, 2, ..., i) by Cow (columns 1, 2, ..., j) with observations (***) in only some cells.]
Cows only occur on one herd: cows are “nested” within herds.
[Diagram: the same Herd by Cow grid with observations (***) in every cell.]
The same cow cannot be present on
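Whether one factor is nested within another can be checked directly from the records. The sketch below assumes one (cow, herd) pair per observation; the identifiers are hypothetical. Cows are nested within herds when every cow is recorded in exactly one herd.

```python
from collections import defaultdict

def is_nested(inner, outer):
    """True if every inner level (e.g. cow) occurs in exactly one outer level (e.g. herd)."""
    outer_levels = defaultdict(set)
    for i, o in zip(inner, outer):
        outer_levels[i].add(o)
    return all(len(levels) == 1 for levels in outer_levels.values())

# Hypothetical records: repeated observations on cows kept in two herds.
cows = ["c1", "c1", "c2", "c3", "c4"]
herds = ["h1", "h1", "h1", "h2", "h2"]
print(is_nested(cows, herds))
```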
[Table residue: Group 8-1, n 25, Average 0.80, SD 0.08]
[Figure: scatterplot of %Sleep against Thor (mm).]
2. “Group” and “Sleep”
Sleep
Group  n   Average  SD
0-0    25  21.6     12.5
1-0    25  24.1     16.7
1-1    25  25.8     18.4
8-0    25  25.2     19.8
8-1    25  20.8     10.7
“Anova: Single Factor” (Thorax) in Excel
Groups  Count  Sum    Average  Variance  SD
0-0     25     20.90  0.8360   0.0071    0.084
1-0     25     20.64  0.8256   0.0049    0.070
1-1     25     20.94  0.8376   0.0050    0.071
8-0     25     20.14  0.8056   0.0067    0.082
8-1     25     20.00  0.8000   0.0061    0.078
p = 0.29
Exploratory analyses – wrapping up