WELCOME
Submitted to:-
Dr. N. V. Soni
Assistant Professor, Dept. of GPB
SDAU, S.K. Nagar
Presented by:-
Vaghela Gauravrajsinh K
M.Sc. (Agri.)
Reg.no:-04-AGRMA-01840-2018
SDAU, S.K. Nagar
Concepts of Correlation and Path Analysis
Correlation Analysis
The term coefficient of correlation was first used by Karl Pearson in 1902.
Correlation is defined as the statistic which measures the degree and direction of
association between two or more variables.
Correlation is denoted by “r”.
A positive value of r indicates that the two variables change in the same direction, i.e.
high values of one variable are associated with high values of the other, and low values
with low values.
When r is negative, the changes are in opposite directions, i.e. high values of one
variable are associated with low values of the other.
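The sign and size of r can be illustrated with a short Python sketch. The trait names and numbers below are hypothetical, chosen only to show one positive and one negative association; the formula is the usual covariance divided by the square root of the product of the two variances.

```python
import numpy as np

def pearson_r(x, y):
    """Simple correlation: r = cov(x, y) / sqrt(var(x) * var(y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical observations on five plants (illustrative numbers only).
pods_per_plant = [12, 15, 11, 18, 20]
grain_yield = [4.1, 5.0, 3.8, 6.2, 6.9]    # rises as pods per plant rises
days_to_mature = [9.0, 7.5, 9.4, 6.1, 5.2]  # falls as pods per plant rises

r_pos = pearson_r(pods_per_plant, grain_yield)
r_neg = pearson_r(pods_per_plant, days_to_mature)
print(round(r_pos, 3), round(r_neg, 3))
```

The first value is positive because both traits move together; the second is negative because one trait falls as the other rises.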
Properties of Correlation Coefficient
1. It is independent of the unit of measurement.
2. Its value lies between -1 and +1.
3. It measures the degree and direction of association between two or more
variables.
4. Correlations are of three types, viz., Simple, Partial and Multiple.
5. Analysis of correlation coefficients involves second order statistics. It is
based on estimates of variance and co-variance.
6. Simple, partial and multiple correlation can be estimated from both replicated
and un-replicated data.
7. Simple correlation coefficients are of three types, viz., Genotypic, Phenotypic
and Environmental. Estimation of these three types of correlation coefficient
is possible from replicated data only.
8. In plant breeding, estimates of three types of correlation coefficient, viz.,
Genotypic, Phenotypic and Environmental are commonly used.
9. In plant breeding, estimates of correlation coefficient are useful in determining
yield components which can be used for genetic improvement of yield.
Types of Correlation
1. Simple Correlation
2. Partial Correlation
3. Multiple Correlation
Simple Correlation
Simple correlation refers to the association between two variables.
It is also known as total correlation or zero order correlation.
Features of Simple Correlation
1. It involves two variables.
2. It is denoted as r12.
3. It ignores effects of other independent variables.
4. It is estimated from variances and co-variances.
5. Its value is always lower than multiple correlation.
6. It is of three types, viz., genotypic, phenotypic and environmental.
7. It can be calculated from un-replicated data also.
Partial Correlation
Partial correlation refers to the correlation between two variables after eliminating
the effect of a third variable.
It is also known as net correlation.
Features of Partial Correlation
1. It is denoted as r12.3.
2. It involves three or four variables.
3. It does not ignore the effects of other independent variables.
4. It is estimated from simple correlations.
5. Its value is always lower than multiple correlation.
6. It is of two types, viz., first order and second order.
7. It can be calculated from un-replicated data also.
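Since partial correlation is estimated from simple correlations, the first order coefficient r12.3 can be sketched in a few lines of Python. The formula used is the standard one, r12.3 = (r12 - r13 r23) / sqrt((1 - r13²)(1 - r23²)); the three input values are hypothetical.

```python
import math

def partial_r(r12, r13, r23):
    """First order partial correlation r12.3 from three simple correlations."""
    numerator = r12 - r13 * r23
    denominator = math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))
    return numerator / denominator

# Hypothetical simple correlations among three traits (illustrative only).
r12, r13, r23 = 0.60, 0.50, 0.40
r12_3 = partial_r(r12, r13, r23)
print(round(r12_3, 4))
```

A second order coefficient such as r12.34 can be obtained with the same function by passing first order partials (r12.3, r14.3, r24.3) in place of the simple correlations.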
Multiple Correlation
Multiple correlation refers to the joint influence of two or more independent
variables on a dependent variable.
Features of Multiple Correlation
1. It involves several variables.
2. It is denoted as R1.23.
3. It does not ignore effects of other independent variables.
4. It is estimated from simple correlations.
5. Its value is always higher than simple and partial correlation.
6. It is of one type only.
7. It provides an estimate of the coefficient of determination.
8. It is a non-negative estimate.
9. It can be calculated from un-replicated data also.
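As a rough illustration of how R1.23 follows from simple correlations, the sketch below uses the standard two-predictor formula R²1.23 = (r12² + r13² - 2 r12 r13 r23) / (1 - r23²). The input values are hypothetical. R² is the coefficient of determination, and R is its non-negative square root.

```python
import math

def multiple_r_squared(r12, r13, r23):
    """Coefficient of determination R^2 of variable 1 on variables 2 and 3."""
    return (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)

# Hypothetical simple correlations (illustrative only).
r12, r13, r23 = 0.60, 0.50, 0.40
R_squared = multiple_r_squared(r12, r13, r23)
R = math.sqrt(R_squared)  # multiple correlation, never negative
print(round(R_squared, 4), round(R, 4))
```

With these inputs R is at least as large as either simple correlation in absolute value, which is the property listed in feature 5.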
Sr. No | Particulars | Simple Correlation | Partial Correlation | Multiple Correlation
1 | Variables involved | Two | Three or four | Several
2 | Denoted as | r12 | r12.3 | R1.23
3 | Types | Three | Two (first and second order) | One only
4 | Estimated from | Variances and co-variances | Simple correlations | Simple correlations
5 | Estimation of coefficient of determination | Not possible | Not possible | Possible
6 | Value is always | Lower than R | Lower than R | Higher than r
7 | Sign of value | Negative or positive | Negative or positive | Always positive
Genotypic Correlation
The inherent or heritable association between two variables is referred to as
genotypic or genetic correlation.
Genotypic correlation is more stable and is of paramount importance to the
plant breeder, since it allows genetic improvement in one character by selecting
for the other character of a genetically correlated pair.
Features of Genotypic Correlation
1. It involves two variables.
2. It is denoted as rg12.
3. It is estimated from genotypic variances and co-variances.
4. Its value is always lower than multiple correlation.
5. It is of one type only.
6. Its estimation is possible from replicated data only.
7. It is due to linkage or pleiotropy or both.
If the association between two variables remains the same in the parental
population and in the segregating population, it is due to pleiotropy.
If the association changes in the segregating population, it is most likely due to
linkage between two genes that has been broken in the segregating population
through recombination between those genes.
Pleiotropy or linkage may involve two desirable traits, or one desirable and
one undesirable trait.
The first situation enhances genetic improvement, whereas the second
hinders progress.
Phenotypic Correlation
The observable correlation between two variables is called phenotypic correlation.
It includes both genotypic and environmental effects and, therefore, differs under different
environmental conditions.
Features of Phenotypic Correlation
1. It involves two variables.
2. It is denoted as rph12.
3. It is estimated from phenotypic variances and co-variances.
4. Its value is always lower than multiple correlation.
5. It is of one type only.
6. Its estimation is possible from replicated data only.
7. It includes both genotypic and environmental effects.
8. It is less stable than genotypic correlation.
Environmental Correlation
The correlation between variables which is entirely due to environmental effects is called
environmental correlation.
It is due to error variance and, therefore, differs under different environmental conditions.
Features of Environmental Correlation
1. It involves two variables.
2. It is denoted as re12.
3. It is estimated from environmental variances and co-variances.
4. Its value is always lower than multiple correlation.
5. It is of one type only.
6. Its estimation is possible from replicated data only.
7. It includes environmental effects.
8. It is not stable.
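The three correlations can be sketched from replicated data as follows. This is a minimal illustration, assuming a randomized block design with a hypothetical number of replications and hypothetical mean squares and mean products taken from the analysis of variance and co-variance; the usual component estimates are assumed (environmental = error mean square, genotypic = (genotype mean square - error mean square) / replications, phenotypic = genotypic + environmental).

```python
import math

def components(ms_genotype, ms_error, reps):
    """Split a genotype mean square (or mean product) into
    genotypic, environmental and phenotypic components."""
    environmental = ms_error
    genotypic = (ms_genotype - ms_error) / reps
    phenotypic = genotypic + environmental
    return genotypic, environmental, phenotypic

def correlation(cov12, var1, var2):
    """Correlation from a co-variance and the two matching variances."""
    return cov12 / math.sqrt(var1 * var2)

reps = 3  # hypothetical number of replications

# Hypothetical mean squares for trait 1 and trait 2, and mean products for the pair.
vg1, ve1, vp1 = components(ms_genotype=50.0, ms_error=5.0, reps=reps)
vg2, ve2, vp2 = components(ms_genotype=32.0, ms_error=8.0, reps=reps)
cg12, ce12, cp12 = components(ms_genotype=26.0, ms_error=2.0, reps=reps)

rg12 = correlation(cg12, vg1, vg2)   # genotypic correlation
re12 = correlation(ce12, ve1, ve2)   # environmental correlation
rph12 = correlation(cp12, vp1, vp2)  # phenotypic correlation
print(round(rg12, 4), round(rph12, 4), round(re12, 4))
```

Without replication there is no error mean square, so the genotypic and environmental parts cannot be separated, which is why these estimates require replicated data.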
Comparison of Genotypic, Phenotypic and Environmental Correlation
Sr. No | Particulars | Genotypic Correlation | Phenotypic Correlation | Environmental Correlation
1 | Variables involved | Two | Two | Two
2 | Denoted as | rg | rph | re
3 | Variances and co-variances used for estimation | Genotypic | Phenotypic | Environmental
4 | Stability | Stable | Less stable | Not stable
Scales for Correlation Coefficients [Searle, 1965]
Sr. No | Value of Correlation Coefficient | Rate or Scale
1 | >0.65 | Very strong
2 | 0.50 to 0.64 | Moderately strong
3 | 0.30 to 0.49 | Moderately weak
4 | <0.30 | Very weak
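The scale can be applied mechanically, as in the sketch below. Two assumptions are made that the table does not state: the scale is applied to the absolute value of r, and values falling between the printed class limits (for example exactly 0.65) are assigned to the higher class.

```python
def strength(r):
    """Rate a correlation coefficient on the scale above.
    Assumption: the scale applies to |r|, with cut-offs at 0.65, 0.50 and 0.30."""
    size = abs(r)
    if size >= 0.65:
        return "Very strong"
    if size >= 0.50:
        return "Moderately strong"
    if size >= 0.30:
        return "Moderately weak"
    return "Very weak"

# Hypothetical coefficients (illustrative only).
for r in (0.72, -0.55, 0.31, 0.10):
    print(r, strength(r))
```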
Path Coefficient Analysis
Path analysis was originally developed by Wright in 1921, but the technique was
first used for plant selection by Dewey and Lu in 1959.
A path coefficient is simply a standardized partial regression coefficient;
path coefficient analysis splits the correlation coefficient into measures of
direct and indirect effects.
For example, in black gram, grain yield (X4) is affected by the number of
primary branches (X1), the number of secondary branches (X2) and pods per
plant (X3).
Main features of Path Coefficient Analysis
1. Path analysis measures the cause of association between two variables.
2. Analysis of path coefficient is based on all possible simple correlations among
various characters.
3. It provides information about the direct and indirect effects of independent
variables on the dependent variable.
4. The analysis is based on the assumptions of linearity and additivity.
5. It also estimates residual effects.
6. Path analysis helps in determining yield contributing characters and thus is
useful in indirect selection.
Types of Path Coefficient
Path coefficient analysis can be carried out from both un-replicated and replicated
data. From un-replicated data only the simple path can be worked out. Path
coefficients that are worked out from replicated data are of three types, viz.,
phenotypic, genotypic and environmental. These are briefly discussed below:
1. Phenotypic Path: The phenotypic path is worked out from phenotypic correlation
coefficients. It splits the phenotypic correlation coefficient into
measures of direct and indirect effects.
2. Genotypic Path: Path coefficients which are worked out from genotypic
correlation coefficients are referred to as genotypic paths. The genotypic path
splits the genotypic correlation coefficient into measures of the direct and indirect
contributions of various independent characters towards a dependent character,
say yield, in plant breeding experiments.
3. Environmental Path: A path coefficient which is worked out from the estimates
of environmental correlation coefficients is referred to as an environmental path.
It is worked out from all possible environmental correlation coefficients
among the characters included in the study.
Comparison of Correlation and Path coefficient Analysis
Sr. No | Particulars | Correlation Analysis | Path Analysis
1 | Analysis is based on | Variances and co-variances | Simple correlation coefficients
2 | Provides information on | Degree and direction of association | Direct and indirect effects
3 | Types | Three (simple, partial and multiple) | Three (genotypic, phenotypic and environmental)
4 | Based on assumptions | Linearity and additivity | Linearity and additivity
5 | Residual effect | Not measured | Measured
6 | Determines | Yield components | Yield components
COMPUTATION OF PATH COEFFICIENTS
Path coefficient analysis consists of the following important steps:
1. Selection of Genotypes
2. Evaluation of Material
3. Statistical Analysis:
• Estimation of variances and covariances for all characters and their
combinations, respectively.
• Calculation of all possible simple correlation coefficients among the
characters included in the study; their number is equal to n(n-1)/2, where n is
the number of variables (characters).
• Path analysis is then carried out according to the procedure described by Dewey
and Lu (1959). It consists of three steps, viz., calculation of (a) direct effects, (b)
indirect effects, and (c) residual effects.
Path Diagram
• In path analysis, a line diagram constructed with the help of simple
correlation coefficients among the characters included in the study is
referred to as a path diagram.
This diagram is useful in several ways, as given below.
• It depicts the cause and effect situation in a simple manner and makes the
presentation of results more attractive. In other words, it provides a visual
picture of the cause and effect situation.
• It also depicts the association between various characters.
• It helps in understanding the direct and indirect contributions of various
independent variables towards a dependent variable.
• It helps in setting up the simultaneous equations which are used for the
estimation of direct effects.
[Figure: Path diagram constructed using the correlation coefficients among yield
(X4) and its component traits in black gram. R denotes the residual effect.]
Direct Effects
Every component character has a direct effect on yield. In
addition, it also exerts indirect effects via other component
characters. The direct effects, or contributions of the various causal factors, are
estimated by solving the simultaneous equations after substituting the values
of the simple correlation coefficients. In this way the estimates of the direct
effects, viz., the values of P14, P24 and P34, are obtained.
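For three component traits (X1, X2, X3) and yield (X4), the simultaneous equations of Dewey and Lu take the form r14 = P14 + r12 P24 + r13 P34, r24 = r12 P14 + P24 + r23 P34 and r34 = r13 P14 + r23 P24 + P34. A minimal Python sketch of solving them is given below; the correlation values are hypothetical and the variable names are illustrative.

```python
import numpy as np

# Hypothetical simple correlations among the component traits X1, X2, X3.
R_xx = np.array([
    [1.0, 0.4, 0.3],   # r11, r12, r13
    [0.4, 1.0, 0.5],   # r12, r22, r23
    [0.3, 0.5, 1.0],   # r13, r23, r33
])
# Hypothetical correlations of each component trait with yield X4: r14, r24, r34.
r_xy = np.array([0.6, 0.5, 0.7])

# Solve R_xx @ P = r_xy for the direct effects P14, P24, P34.
direct = np.linalg.solve(R_xx, r_xy)
for name, p in zip(("P14", "P24", "P34"), direct):
    print(name, round(float(p), 4))
```

Substituting the solved values back into the three equations should reproduce r14, r24 and r34, which is a convenient check on the arithmetic.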
Indirect effects
The effects of an independent character on the dependent one via other independent
traits are known as indirect effects. These effects are computed by substituting the values
of the correlation coefficients and those of the direct effects as follows.
• Indirect effects of primary branches (X1) on yield (X4):
  - via secondary branches (X2) = r12 × P24
  - via pods per plant (X3) = r13 × P34
• Similarly, the indirect effects of secondary branches (X2) on yield (X4):
  - via primary branches (X1) = r12 × P14
  - via pods per plant (X3) = r23 × P34
• The indirect effects of the other component traits, e.g., pods per plant (X3), may be
computed in a similar fashion.
The direct effect and the indirect effects of a trait add up to its correlation with
yield, e.g. r14 = P14 + r12 × P24 + r13 × P34.
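The whole table of direct and indirect effects can be sketched at once. The example assumes three component traits (X1 to X3) with yield as X4, and reuses hypothetical correlation values; in the resulting matrix, the diagonal holds the direct effects and each off-diagonal entry is the indirect effect of the row trait via the column trait.

```python
import numpy as np

traits = ["X1 primary branches", "X2 secondary branches", "X3 pods per plant"]

# Hypothetical correlations among component traits, and of each with yield (X4).
R_xx = np.array([
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.5],
    [0.3, 0.5, 1.0],
])
r_xy = np.array([0.6, 0.5, 0.7])

# Direct effects P14, P24, P34 from the simultaneous equations.
direct = np.linalg.solve(R_xx, r_xy)

# effects[i, j] = r_ij * P_j4: direct effect on the diagonal (r_ii = 1),
# indirect effect of trait i via trait j elsewhere.
effects = R_xx * direct

for i, name in enumerate(traits):
    row = ", ".join(f"{effects[i, j]:.4f}" for j in range(len(traits)))
    print(f"{name}: [{row}]  total = {effects[i].sum():.4f}")
```

Each row total equals that trait's correlation with yield, so a trait with a small direct effect but a large total is contributing mostly through other traits.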
Residual Effect
In plant breeding, it is very difficult to have complete knowledge of all
component traits of yield. The residual effect permits precise explanation
about the pattern of interaction of other possible components of yield. In
other words, residual effect measures the role of other possible independent
variables which were not included in the study on the dependent variable.
The residual effect is estimated with the help of direct effects and simple
correlation coefficients as given below.
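$$1 = P_{R4}^{2} + P_{14} r_{14} + P_{24} r_{24} + P_{34} r_{34}$$
or, in general, for a dependent variable y,
$$1 = P_{Ry}^{2} + \sum_{i} P_{iy} r_{iy}$$
• Where $P_{R4}^{2}$ is the square of the residual effect.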
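Rearranging the relation above gives the residual effect as the square root of 1 minus the sum of (direct effect × correlation with yield). A minimal sketch with hypothetical correlations, assuming three component traits and yield as X4:

```python
import math
import numpy as np

# Hypothetical correlations among component traits X1..X3 and with yield X4.
R_xx = np.array([
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.5],
    [0.3, 0.5, 1.0],
])
r_xy = np.array([0.6, 0.5, 0.7])

direct = np.linalg.solve(R_xx, r_xy)   # P14, P24, P34

explained = float(direct @ r_xy)       # sum of P_i4 * r_i4
residual_sq = 1.0 - explained          # square of the residual effect
residual = math.sqrt(residual_sq)      # residual effect
print(round(explained, 4), round(residual, 4))
```

A large residual effect suggests that traits not included in the study account for much of the variation in yield.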
APPLICATION IN CROP IMPROVEMENT
• The information obtained by this technique helps in indirect selection for
genetic improvement of yield.
• Selection for a component trait with a view to improve yield is called
indirect selection, while selection for yield per se is termed direct
selection.
• Searle (1965) has given the minimum combination of heritability and
correlation coefficient values necessary for indirect selection to be more
efficient than direct selection for yield.
Merits
Path analysis provides information about the cause and effect situation and
helps in understanding the cause of association between variables.
It is quite possible that a trait showing positive direct effect on yield may have
a negative indirect effect via other component traits. Path analysis permits the
examination of direct effects of various characters on yield as well as their
indirect effects via other component traits. Thus through the estimates of direct
and indirect effects, it determines the yield components.
It provides a basis for the selection of superior genotypes from diverse breeding
populations.
Demerits
• Path analysis is designed to deal with variables having additive effects. Its
application to variables having non-additive effects may lead to wrong
results (Kempthorne, 1957).
• Its computation is somewhat difficult, and the inclusion of many variables
makes the computation more complicated.
REFERENCES
Singh, Phundan and Narayanan, S. S. (2013). Correlation analysis and path
analysis. In: Biometrical Techniques in Plant Breeding (3rd ed.), pp. 33-48.
Dabholkar, A. R. (1998). Path analysis. In: Elements of Biometrical Genetics
(1st ed.), pp. 422-439.
