Introduction to Biostatistics
Dr. A.V. Dusane
anildusane@gmail.com
Introduction to Biostatistics
• Definition: Biometry, or biostatistics, deals with the application of
statistical methods to the analysis of biological variation,
correlation and regression.
• Biometry helps to interpret a large number of observations (data),
to understand the complex relations among the different factors
that govern biological phenomena, and to draw inferences
from a limited number of measurements.
• Sir Francis Galton is regarded as the founder of biometry.
• W.F.R. Weldon coined the term biometry.
Important Biostatistical terms
• Population: The totality or aggregate of individuals with specified characteristics. It may be
finite or infinite. E.g. the number of plants in a quadrat is finite, while the number of phytoplankton in a
pond is so large that it is treated as an infinite population.
• Sample: A part of the population, i.e. a group of individuals selected from a
particular population. It is used to learn about the whole population by observing a few
individuals.
• Sampling: The process of selecting a part of the population to represent the whole population is
referred to as sampling.
• Primary data: Data collected through personal investigation from the original
source, or by performing an experiment, are called primary data.
• Secondary data: Data collected by some other source and used, after some processing,
for further analysis are called secondary data.
• Qualitative data: A character of a population that cannot be expressed numerically, such as
colour of flower, nature of seed coat, etc., is called qualitative data.
Important Biostatistical terms
• Quantitative data: The magnitude of any character/parameter which can be
numerically measured is called quantitative data.
• Parameter: It is any numerical quantity that characterizes a given population or
some aspect of it. This means the parameter tells us something about the whole
population.
• Statistics: It is a branch of science which deals with methods of collection,
classification and analysis of data. It also deals with the testing of hypotheses and the drawing of
inferences from the collected data.
• Variables: Any character (qualitative or quantitative) or parameter which is
likely to vary from individual to individual in the same population is known as a
variable or variate. E.g. colour of flowers, height of plant, etc.
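The relation between population, sample and parameter can be sketched in a few lines of Python. This is only an illustration: the population of 500 plant heights is generated artificially, and the names `population`, `sample`, `population_mean` and `sample_mean` are assumptions introduced here, not part of the slides.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 500 plants, generated for illustration only.
rng = random.Random(42)
population = [round(rng.gauss(50.0, 5.0), 1) for _ in range(500)]

# Parameter: a numerical quantity that characterizes the whole population.
population_mean = statistics.mean(population)

# Sampling: select a part of the population (here 30 plants, without replacement).
sample = rng.sample(population, 30)

# The sample mean is used to learn about the population mean from a few individuals.
sample_mean = statistics.mean(sample)

print(f"population mean = {population_mean:.2f} cm")
print(f"sample mean     = {sample_mean:.2f} cm (n = {len(sample)})")
```

The two means will usually be close but not identical, which is why inferences drawn from a sample always carry some uncertainty.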
Important Biostatistical terms
• Discrete variable: A variable that cannot take fractional values and is therefore always an integer.
E.g. number of seeds in a pod, number of grains in an ear head, number of plants in a given quadrat, etc.
The number of seeds in a pod is always an integer (5, 6, 7, etc.), never 5.1, 6.9 or 7.2.
• Continuous variable: A variable that can take fractional values. E.g. height, weight,
length, volume, etc. The height of a plant can be fractional (e.g. 9.2 cm).
• Statistical error or disturbance: It is the amount by which an observation differs from its expected
value, the latter being based on the whole population from which the statistical unit was chosen
randomly.
• Linear functions: A function that can be graphically represented in the Cartesian coordinate plane
by a straight line is called a linear function. It is written as F(x) = mx + c, where m and c are
constants and x is a real variable. The constant m is called the slope and c the y-intercept.
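A minimal Python sketch of the linear function F(x) = mx + c follows. The values m = 2 and c = 1 and the helper names `linear` and `slope_and_intercept` are hypothetical, chosen only for illustration.

```python
def linear(x, m, c):
    """Evaluate the linear function F(x) = m*x + c."""
    return m * x + c


def slope_and_intercept(p1, p2):
    """Recover m and c from two points (x1, y1), (x2, y2) with x1 != x2."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)   # slope: change in y per unit change in x
    c = y1 - m * x1             # y-intercept: value of F at x = 0
    return m, c


# Hypothetical constants: m = 2, c = 1
ys = [linear(x, 2, 1) for x in range(5)]
print(ys)                                        # [1, 3, 5, 7, 9]

# Equal steps in x give equal steps in y, which is what a straight line means.
diffs = [b - a for a, b in zip(ys, ys[1:])]
print(diffs)                                     # [2, 2, 2, 2]

print(slope_and_intercept((0, 1), (4, 9)))       # (2.0, 1.0)
```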
Important Biostatistical terms
• Non-linear functions: A function that cannot be graphically
represented in the Cartesian coordinate plane by a straight line is
called a non-linear function. Its points do not fall on a
straight line.
• Frequency distribution: The manner in which the frequencies are
distributed over different classes is called frequency distribution.
• Data: It is a collective term referring to a group of observations, as a
unit.
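As a rough illustration of a frequency distribution, the Python sketch below counts how often each value of a discrete variable occurs. The twenty seed counts are hypothetical data invented for this example.

```python
from collections import Counter

# Hypothetical observations: number of seeds per pod in 20 pods (a discrete variable).
seeds_per_pod = [5, 6, 7, 6, 5, 8, 6, 7, 6, 5, 7, 6, 8, 6, 7, 5, 6, 7, 6, 6]

# Frequency: how many observations fall into each class (here, each integer value).
frequency = Counter(seeds_per_pod)
total = len(seeds_per_pod)

for value in sorted(frequency):
    count = frequency[value]
    print(f"{value} seeds: frequency = {count:2d}, relative frequency = {count / total:.2f}")
```

For a continuous variable such as plant height, the same idea applies after the values have been grouped into class intervals.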
Scope and applications of Biostatistics
• A good understanding of biometry is essential, as the methods of
biometry are used for designing experiments and for analyzing and interpreting the data
in order to draw dependable conclusions.
• It helps biologists to understand the variability that exists in nature, to
understand complex interactions and to get a feel for life processes.
• By applying biostatistical methods, one can reach worthwhile
conclusions.
• Biometry has many applications in most branches of biology, such
as agriculture, genetics, plant breeding, ecology, physiology,
biochemistry, molecular biology, taxonomy, etc.
Agriculture
• Biometry plays a significant role in the analysis and interpretation of large and complex
data sets. For instance, animal scientists use
statistical procedures to analyze data on different breeds for
decision-making.
• Biometry supports accurate statistical analysis of data in breeding
programmes.
• Animal nutritionists use statistical techniques to study the impact of
new feeds on the growth of an animal, especially to determine optimum
nutritional needs.
• Agricultural economists use statistics-based forecasting procedures
to estimate the future demand for and supply of a particular
crop product.
• Incorrect projections of the future demand for a crop or
crops can affect the whole economy.
Agriculture
• Agricultural engineers use statistical procedures in many
areas, such as irrigation research, modes of cultivation and
design of harvesting.
• Biostatistics is useful in studying various agronomic
features, viz. weight of a fruit, yield of a crop per acre, number of
seeds per pod, number of tillers per plant, height of a crop, etc. Estimating
these agronomic features helps to evaluate the
performance of different varieties of a crop.
• Sound statistical procedures are needed in agriculture
for the design of experiments as well as for the analysis of
data.
Genetics
• Biostatistics has applications in qualitative and
quantitative genetics. For instance, Gregor Mendel was able to
propose the laws of inheritance by applying statistical
analysis.
• By applying probability models, the segregation ratios
observed by Mendel in pea can be derived.
• Biostatistics plays an important role in understanding the
concept of polygenic inheritance. Galton (1888) and Karl
Pearson (1905) tried to explain polygenic inheritance through
correlation and regression studies.
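How a probability model yields a segregation ratio can be sketched as follows. The example assumes a single gene with a dominant allele "A" and a recessive allele "a", a cross between two heterozygotes, and equally likely gametes. The function name `monohybrid_cross` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def monohybrid_cross(parent1="Aa", parent2="Aa"):
    """Enumerate the equally likely gamete combinations for a single gene."""
    offspring = Counter()
    for g1, g2 in product(parent1, parent2):
        genotype = "".join(sorted(g1 + g2))   # 'aA' and 'Aa' are the same genotype
        offspring[genotype] += 1
    total = sum(offspring.values())
    return {g: Fraction(n, total) for g, n in offspring.items()}


genotypes = monohybrid_cross()
print(genotypes)   # AA: 1/4, Aa: 1/2, aa: 1/4  (a 1:2:1 genotype ratio)

# The dominant phenotype appears whenever at least one 'A' allele is present.
dominant = sum(p for g, p in genotypes.items() if "A" in g)
recessive = 1 - dominant
print(dominant, recessive)   # 3/4 1/4  (a 3:1 phenotype ratio)
```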
Genetics
• Biometry finds applications in estimating gene
frequencies and genotype frequencies.
• The chi-square test is the most widely used test in
interpreting genetic experiments, especially in the
detection of single-factor ratios, linkage and
heterogeneity.
• The strength of linkage is measured by
calculating the recombination percentage.
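The two calculations named above can be sketched in plain Python. All counts below are hypothetical, and the function names `allele_frequencies` and `chi_square` are introduced only for this illustration. The critical value 3.841 is the standard chi-square value for 1 degree of freedom at the 5% level.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Gene (allele) frequencies p and q from genotype counts at one locus."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)   # each individual carries two alleles
    p = (2 * n_AA + n_Aa) / total_alleles
    return p, 1 - p


def chi_square(observed, expected_ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts: 36 AA, 48 Aa, 16 aa
p, q = allele_frequencies(36, 48, 16)
print(f"p = {p:.2f}, q = {q:.2f}")

# Hypothetical F2 counts tested against a 3:1 single-factor ratio
observed = [290, 110]
chi2 = chi_square(observed, [3, 1])
CRITICAL_5PCT_DF1 = 3.841   # chi-square critical value, 1 degree of freedom, 5% level

print(f"chi-square = {chi2:.3f}")
if chi2 < CRITICAL_5PCT_DF1:
    print("The data do not deviate significantly from the 3:1 ratio.")
else:
    print("The data deviate significantly from the 3:1 ratio.")
```

With two phenotype classes there is one degree of freedom (number of classes minus one), which is why the 1-degree-of-freedom critical value is used here.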
Plant Physiology
• The strength of the relationship between various
physiological phenomena and different external and
internal factors can be estimated by correlation
coefficients.
• Regression analysis is more useful when two or more
variables control the same phenomenon and the relationship
between them has to be established.
• Statistical methods can be used to estimate the
germinability rate of seeds, rate of photosynthesis,
productivity rate of a particular alkaloid, etc.
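A correlation coefficient and a fitted regression line can be sketched as below. The sketch assumes Pearson's product-moment correlation and a simple least-squares line with one explanatory variable. The light-intensity and photosynthesis values are hypothetical, as are the function names.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares regression of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: light intensity (arbitrary units) and rate of photosynthesis
light = [1, 2, 3, 4, 5]
rate = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(light, rate)
slope, intercept = least_squares_line(light, rate)
print(f"r = {r:.4f}")
print(f"rate = {slope:.2f} * light + {intercept:.2f}")
```

The coefficient r measures how strong the linear relationship is, while the regression line describes how the rate changes per unit change in light intensity.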
Ecology
• Statistical methods are widely used in vegetation
studies, especially in estimating the frequency,
abundance, distribution and density of a particular plant
species or animal group in a specified geographical
area.
• Biometry enables us to estimate the biodiversity index
of a locality.
• Some statistical models are used to forecast the
incidence of epidemics, the rate of accumulation
of different pollutants in the ecosystem, etc.
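The vegetation measures mentioned above can be sketched as follows, assuming the usual quadrat definitions (frequency as the percentage of quadrats occupied, density as individuals per quadrat sampled, abundance as individuals per occupied quadrat). The slides do not say which biodiversity index is meant, so the Shannon index is used here as one common choice. All counts are hypothetical.

```python
import math


def quadrat_measures(counts):
    """counts: number of individuals of one species found in each quadrat."""
    total_quadrats = len(counts)
    occupied = sum(1 for c in counts if c > 0)
    individuals = sum(counts)
    return {
        "frequency_percent": 100 * occupied / total_quadrats,
        "density": individuals / total_quadrats,
        "abundance": individuals / occupied if occupied else 0.0,
    }


def shannon_index(species_counts):
    """Shannon diversity index H' = -sum(p_i * ln p_i) over all species present."""
    total = sum(species_counts)
    return -sum((n / total) * math.log(n / total) for n in species_counts if n > 0)


# Hypothetical counts of one plant species in 10 quadrats
print(quadrat_measures([0, 3, 5, 0, 2, 0, 4, 6, 0, 0]))
# {'frequency_percent': 50.0, 'density': 2.0, 'abundance': 4.0}

# Hypothetical community with four equally common species
print(f"H' = {shannon_index([10, 10, 10, 10]):.3f}")   # H' = 1.386, i.e. ln(4)
```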
Taxonomy
• Traditional methods are inadequate to classify a plant
or animal species in a particular taxon because of
insufficient knowledge about the affinity or similarity
index.
• Numerical taxonomy, i.e. the numerical evaluation of
the similarity or affinity between taxonomic units, is
useful in classification systems.
• Indices such as coefficients of association and
correlation have many applications in taxonomy.
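As an illustration of numerical similarity between taxonomic units, the sketch below computes two widely used coefficients of association, the simple matching coefficient and the Jaccard coefficient, from presence/absence character scores. The slides do not name a specific coefficient, so both are assumptions, and the character scores are hypothetical.

```python
def simple_matching(a, b):
    """Share of characters on which two taxa agree (both present or both absent)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def jaccard(a, b):
    """Share of characters present in both taxa among those present in at least one."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


# Hypothetical scores for 8 characters (1 = present, 0 = absent)
taxon_a = [1, 1, 0, 1, 0, 1, 1, 0]
taxon_b = [1, 0, 0, 1, 0, 1, 1, 1]

print(f"simple matching = {simple_matching(taxon_a, taxon_b):.2f}")   # 0.75
print(f"Jaccard         = {jaccard(taxon_a, taxon_b):.2f}")           # 0.67
```

The two coefficients differ because simple matching counts shared absences as agreement, while Jaccard ignores them.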
Molecular Biology
• Statistical methods are used for calculating the
recombination percentage.
• Biostatistics is essential in the construction of
gene maps.
• Biostatistics has applications in the determination
of nucleotide sequences, the preparation of
cladograms, etc.
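The recombination percentage can be sketched as the share of recombinant offspring among all offspring scored in a test cross. The progeny counts and the function name `recombination_percentage` are hypothetical.

```python
def recombination_percentage(parental, recombinant):
    """Recombinant offspring as a percentage of all offspring scored."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no offspring were scored")
    return 100 * recombinant / total


# Hypothetical test-cross progeny: two parental classes and two recombinant classes
parental = 410 + 390
recombinant = 105 + 95

print(f"recombination = {recombination_percentage(parental, recombinant):.1f}%")   # 20.0%
```

In gene mapping, one percent recombination is conventionally taken as one map unit, so values like this serve as approximate distances between linked genes, most reliably when the genes are close together.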