Intro

Assessing the Representativeness of FADN
The purpose of this evaluation is to assess the representativeness of Farm Accountancy Data Network (FADN) data.
This is achieved, first, by comparing FADN estimates to population averages for 2020 obtained from Integrated Farm Statistics (IFS). The population of interest for FADN includes farms that belong to IFS and have an economic size above a minimum threshold specified by each Member State. This Eurostat-FADN comparison focuses on three indicators: Utilised Agricultural Area (UAA) in hectares, Standard Output (SO) in euro, and Livestock Units (LU).
The dashboard also provides measures of precision of FADN estimates for a large number of FADN indicators. This precision assessment is based solely on FADN data and presents estimates from 2011 to 2020 at various levels of aggregation. Estimates account for the FADN sampling design, and standard errors are obtained using variance linearisation techniques.
The dashboard is divided into several sections:
- Methodology: presents the concept of confidence intervals and introduces key measures used in this assessment, such as means, standard errors, and relative standard errors;
- ESTAT comparison: provides a direct comparison between FADN estimates and IFS population values for the year 2020, along with 95% confidence intervals;
- by Member State: presents FADN estimates with 95% confidence intervals, standard errors, and relative standard errors at the level of each Member State (MS);
- by Type of Farming: provides FADN estimates with 95% confidence intervals, standard errors, and relative standard errors at the level of each of the 8 Types of Farming (TF8);
- by Economic Size: presents FADN estimates with 95% confidence intervals, standard errors, and relative standard errors at the level of each of the 6 Economic Size classes (SIZ6);
- by Region: provides FADN estimates, standard errors, and relative standard errors at the level of each FADN region.
It is important to note that this evaluation focuses solely on sampling errors and does not address potential sources of non-sampling error.
Methodology
Confidence intervals
A confidence interval is a range of values, computed from a single sample, that is expected to contain the true population parameter with a specified level of confidence. The confidence interval for the mean is given by:
$$CI = \overline{x} \pm z_\frac{α}{2} SE(\overline{x})\qquad(1)$$
where \(CI\) represents the confidence interval; \(\overline{x}\) denotes the sample mean, which accounts for the stratified sampling design; \(z_\frac{α}{2}\) is the z-score for the desired confidence level (for example, 1.96 for 95% confidence); and \(SE(\overline{x})\) is the standard error of the mean, estimated from survey data in a way that also accounts for the sampling design.
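As an illustration of equation (1), the following minimal Python sketch computes a normal-approximation confidence interval from a mean and its standard error. The function name and the numbers (a mean UAA of 40 ha with a standard error of 2 ha) are hypothetical and are not taken from FADN data.

```python
from statistics import NormalDist


def confidence_interval(mean, se, level=0.95):
    """Return (lower, upper) for the interval mean +/- z * se of equation (1)."""
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    # Two-sided z-score: about 1.96 when level = 0.95
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se


# Hypothetical example values, for illustration only
lower, upper = confidence_interval(40.0, 2.0)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

The z-score is derived from the requested confidence level rather than hard-coded, so the same function also covers other levels such as 90% or 99%.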
Sample means \(\overline{x}\) for FADN take into account the stratified survey design and are computed as:
$$\overline{x} = \frac{\sum_{i} w_i x_i}{\sum_{i}w_i}\qquad(2)$$
where \(w_i\) is the sampling weight of farm \(i\) and \(x_i\) is the value of the indicator for farm \(i\).
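The weighted mean of equation (2) can be sketched in a few lines of Python. The function name and the three sample farms below are hypothetical and serve only to show how the weights change the result compared with an unweighted average.

```python
def weighted_mean(values, weights):
    """Weighted mean of equation (2): sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the sum of the weights must be positive")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical sample: UAA in hectares and the number of farms each one represents
uaa = [20.0, 50.0, 120.0]
weights = [100.0, 40.0, 10.0]

estimate = weighted_mean(uaa, weights)
unweighted = sum(uaa) / len(uaa)
print(f"weighted mean: {estimate:.2f} ha, unweighted mean: {unweighted:.2f} ha")
```

In this made-up sample the small farm represents many more farms in the population than the large one, so the weighted mean lies well below the unweighted mean.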
Confidence intervals and RSE
The confidence intervals of equation (1) are obtained using weighted averages and standard errors estimated from survey data. Standard errors are estimated using the variance linearisation techniques provided by the R package survey. All confidence intervals in this assessment are computed at the 95% confidence level. Results are presented in the sheets by Member State, by Type of Farming, by Economic Size and by Region.
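To give an idea of what variance linearisation involves, the sketch below applies the textbook Taylor linearisation of a weighted mean under a single-stage stratified design. It is written in Python for illustration and is not the code of the R package survey; it assumes the with-replacement approximation (no finite population correction) and at least two sampled farms per stratum. The function name, stratum labels and data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def linearised_se_of_mean(values, weights, strata):
    """Return (weighted mean, linearised standard error) for a stratified sample."""
    total_weight = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total_weight

    # Linearised variable of the ratio sum(w*x) / sum(w)
    z = [w * (x - mean) / total_weight for w, x in zip(weights, values)]

    # Group the linearised values by stratum
    by_stratum = defaultdict(list)
    for stratum, z_i in zip(strata, z):
        by_stratum[stratum].append(z_i)

    # Sum the between-unit variance of the stratum totals over all strata
    variance = 0.0
    for z_h in by_stratum.values():
        n_h = len(z_h)
        if n_h < 2:
            raise ValueError("each stratum needs at least two sampled farms")
        z_bar = sum(z_h) / n_h
        variance += n_h / (n_h - 1) * sum((z_i - z_bar) ** 2 for z_i in z_h)

    return mean, sqrt(variance)


# Hypothetical sample of four farms in two strata
values = [10.0, 14.0, 30.0, 50.0]
weights = [5.0, 5.0, 2.0, 2.0]
strata = ["small", "small", "large", "large"]

mean, se = linearised_se_of_mean(values, weights, strata)
print(f"mean = {mean:.2f}, SE = {se:.3f}")
```

Because the variance is accumulated stratum by stratum, only the variation within strata contributes to the standard error, which is the usual precision gain of stratification.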
Another measure linked to the concept of confidence intervals and typically used to define uncertainty around survey estimates is the Relative Standard Error (RSE). The formula for the RSE is the following:
$$RSE=100\left|\frac{SE(\overline{x})}{\overline{x}}\right|\qquad(4)$$
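The RSE expresses the standard error as a percentage of the estimate, and so indicates the typical percentage deviation of the estimate that would be observed across repeated samples. The RSE corresponds directly to a confidence interval whose confidence level is set to approximately 68% (i.e., \(z_\frac{α}{2}=1\)): such an interval spans the estimate plus or minus RSE percent of its value.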
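The link between the RSE and the 68% confidence level can be checked numerically. The Python sketch below computes the RSE of equation (4) for hypothetical values (a mean of 40 with a standard error of 2) and the probability that a standard normal variable falls within one standard error of its mean.

```python
from math import erf, sqrt


def relative_standard_error(mean, se):
    """RSE of equation (4): 100 * |SE / mean|, expressed in percent."""
    if mean == 0:
        raise ValueError("the RSE is undefined when the mean is zero")
    return 100 * abs(se / mean)


# Hypothetical estimate and standard error, for illustration only
rse = relative_standard_error(40.0, 2.0)

# Coverage of the interval mean +/- 1 * SE under the normal approximation
coverage = erf(1 / sqrt(2))

print(f"RSE = {rse:.1f}%, coverage of +/- 1 SE = {coverage:.4f}")
```

Under the normal approximation the coverage is about 0.683, which is where the figure of approximately 68% comes from.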
Review of quality reports of other surveys
Table 1 lists the surveys reviewed and describes how the institutions responsible for them report on data quality.
For each survey, Table 1 indicates whether accuracy or precision measures are reported and, if so, which ones. Most of the quality assessments are based on precision analyses. Confidence intervals and relative standard errors are widely used to report on the quality of survey data, and the RSE is the main measure used in these assessments. In some instances, precision estimates are accompanied by specific precision thresholds, which serve either as limits or as a way to inform the user about the reliability of the corresponding survey estimates. Some quality assessments include accuracy measures or dedicated activities, although these are tailored to the specific survey and region considered. The main measure used to communicate about accuracy appears to be the (non-)response rate. Most institutions communicate about survey quality using tables with a fixed structure that show a limited set of survey variables and aggregation levels. Other institutions prefer a more flexible approach, in which the user selects the aggregation level and the measure of interest and tailors the report on an interactive platform. Most surveys present their quality reports at national or regional level, and some also provide information at the level of subgroups. Some institutions perform quality assessments yearly.