Survey and Measure Development in R
George Mount
July 31, 2019
Presentation agenda
The “why’s” of survey development & measurement
Applications in R
Resources
Follow along!
rstudio.cloud/project/355799.
Anatomy of a survey
Hitting our mark: reliability and validity
There’s more than one way to validate an instrument!
The measure development process
Finally some code!
packages_used <- c("likert","irr",
"lavaan","semTools","psych","semPlot","corrplot",
"tidyverse","GPArotation")
lapply(packages_used, require, character.only = TRUE)
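If any of these packages aren't installed yet, a short sketch (assuming CRAN access) installs the missing ones first:
# Install any packages in packages_used that aren't already present
to_install <- setdiff(packages_used, rownames(installed.packages()))
if (length(to_install) > 0) install.packages(to_install)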
Step 1: Item generation
Content validity: how essential is each item to the concept?
Inter-rater reliability: do the experts agree this is a good item?
Inter-rater reliability: Base agreement
## Rater_A Rater_B
## 1 3 3
## 2 1 1
## 3 2 3
## 4 2 2
## 5 1 1
## 6 3 2
## Percentage agreement (Tolerance=0)
##
## Subjects = 13
## Raters = 2
## %-agree = 61.5
Account for agreement due to chance: Cohen’s kappa
## Call: cohen.kappa1(x = x, w = w, n.obs = n.obs, alpha = alpha, levels = levels)
##
## Cohen Kappa and Weighted Kappa correlation coefficients and confidence boundaries
## lower estimate upper
## unweighted kappa -0.0056 0.41 0.82
## weighted kappa -0.3086 0.32 0.94
##
## Number of subjects = 13
# Two experts rate their assessment of items on a scale of 1 to 3
head(experts)
library(irr)
agree(experts)
library(psych)
cohen.kappa(experts)
Weighted kappa interpretation
0 - .39     Poor
.4 - .59    Moderate
.6 - .79    Substantial
.8 - 1      Very strong
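To try these checks without the workshop data, here is a minimal sketch that fabricates a stand-in ratings data frame (experts_demo and its random ratings are made up, not the deck's data):
# Hypothetical stand-in: two raters score 13 items on a scale of 1 to 3
set.seed(42)
experts_demo <- data.frame(Rater_A = sample(1:3, 13, replace = TRUE),
                           Rater_B = sample(1:3, 13, replace = TRUE))
library(irr)
agree(experts_demo)       # raw percentage agreement
library(psych)
cohen.kappa(experts_demo) # chance-corrected agreement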
Step 2: Questionnaire administration
Sampling
(non-)Response bias
Other important things not done in R…
Step 2.5: Exploratory data analysis
Summary statistics
## vars n mean sd median trimmed mad min max range
## easy_to_find 1 250 3.50 0.85 3 3.51 1.48 1 5 4
## quick_transaction 2 250 3.53 0.77 4 3.54 1.48 1 5 4
## quick_load 3 250 3.60 0.80 4 3.60 1.48 1 5 4
## easy_navigate 4 250 3.50 0.78 4 3.50 1.48 1 5 4
## post_purchase 5 250 2.93 0.72 3 2.92 0.00 1 5 4
## prompt_request 6 250 3.37 0.73 3 3.40 1.48 1 5 4
## quick_resolve 7 250 2.99 0.88 3 2.98 1.48 1 5 4
## safe_information 8 250 3.20 0.78 3 3.21 0.00 1 5 4
## no_sell_information 9 250 3.60 0.85 4 3.63 1.48 1 5 4
## safe_purchases 10 250 3.23 0.85 3 3.25 1.48 1 5 4
## good_hours 11 250 2.89 1.36 3 2.86 1.48 1 5 4
## skew kurtosis se
## easy_to_find -0.05 -0.25 0.05
## quick_transaction -0.21 -0.08 0.05
## quick_load -0.36 0.40 0.05
## easy_navigate -0.41 0.84 0.05
## post_purchase -0.02 0.59 0.05
## prompt_request -0.21 0.48 0.05
## quick_resolve 0.09 -0.23 0.06
## safe_information 0.03 0.34 0.05
## no_sell_information -0.57 0.51 0.05
## safe_purchases -0.21 0.16 0.05
## good_hours 0.11 -1.22 0.09
Response frequencies
## 1 2 3 4 5 miss
## easy_to_find 0.8 8.8 41.6 36.8 12.0 0
## quick_transaction 0.4 7.6 38.4 45.6 8.0 0
## quick_load 1.2 5.2 36.4 46.4 10.8 0
## easy_navigate 2.0 4.4 43.2 42.8 7.6 0
## post_purchase 2.4 21.2 59.2 15.6 1.6 0
## prompt_request 1.2 7.2 49.6 37.6 4.4 0
## quick_resolve 3.2 25.2 45.2 22.4 4.0 0
## safe_information 1.6 12.8 54.0 26.8 4.8 0
## no_sell_information 2.0 6.8 32.0 48.0 11.2 0
## safe_purchases 2.8 13.2 47.6 31.2 5.2 0
## good_hours 19.6 23.6 20.8 20.4 15.6 0
Response frequencies – visualization
library(psych)
describe(digital_quality)
# Load our BFF, the psych package
library(psych)
round(response.frequencies(digital_quality),4) * 100
## [1] "It is easy to find information on the website."
## [2] "It is quick to make a transaction on the website."
## [3] "The website loads quickly."
## [4] "The website is easy to navigate."
## [5] "The company provides useful post-purchase support."
## [6] "I receive prompt responses to my requests."
## [7] "The company quickly resolves any problems I encounter."
## [8] "My information is safe."
## [9] "The company won't sell my information."
## [10] "I feel safe making purchases through this website."
## [11] "The website provides good hours."
Correlations
# select likert items, convert to factors
library(likert)
# Convert items to factors
dig_qual_f <- dig_qual_labelled %>%
mutate_if(is.integer, as.factor)
names(dig_qual_f) <- c(digital_quality_codebook$Item)
result <- likert(dig_qual_f)
plot(result, group.order = names(dig_qual_f))
digital_quality_codebook$Item
lowerCor(digital_quality)
## esy__ qck_t qck_l esy_n pst_p prmp_ qck_r sf_nf n_sl_
## easy_to_find 1.00
## quick_transaction 0.46 1.00
## quick_load 0.43 0.42 1.00
## easy_navigate 0.52 0.42 0.45 1.00
## post_purchase 0.29 0.26 0.29 0.33 1.00
## prompt_request 0.24 0.13 0.20 0.26 0.40 1.00
## quick_resolve 0.25 0.13 0.14 0.26 0.42 0.43 1.00
## safe_information 0.26 0.25 0.30 0.36 0.22 0.23 0.14 1.00
## no_sell_information 0.21 0.15 0.21 0.24 0.09 0.13 0.09 0.61 1.00
## safe_purchases 0.27 0.15 0.34 0.32 0.28 0.25 0.26 0.57 0.51
## good_hours -0.07 0.03 0.11 0.02 0.03 -0.08 -0.05 -0.03 -0.01
## sf_pr
## easy_to_find
## quick_transaction
## quick_load
## easy_navigate
## post_purchase
## prompt_request
## quick_resolve
## safe_information
## no_sell_information
## safe_purchases 1.00
## good_hours 0.00
## [1] 1.00
From correlations to factors
library(corrplot)
corrplot(cor(digital_quality))
Survey and Measure Development in R
Step 3: Initial item reduction
How many factors/dimensions are “reflected” in the data?
Do any items not reflect a factor?
Exploratory factor analysis (EFA) in psych
## [1] "residual" "dof" "chi" "nh"
## [5] "rms" "EPVAL" "crms" "EBIC"
## [9] "ESABIC" "fit" "fit.off" "sd"
## [13] "factors" "complexity" "n.obs" "objective"
## [17] "criteria" "STATISTIC" "PVAL" "Call"
## [21] "null.model" "null.dof" "null.chisq" "TLI"
## [25] "RMSEA" "BIC" "SABIC" "r.scores"
## [29] "R2" "valid" "score.cor" "weights"
## [33] "rotation" "communality" "communalities" "uniquenesses"
## [37] "values" "e.values" "loadings" "model"
## [41] "fm" "rot.mat" "Phi" "Structure"
## [45] "method" "scores" "R2.scores" "r"
## [49] "np.obs" "fn" "Vaccounted"
# factor analysis with presumed number of factors
dig_qual_EFA <- fa(digital_quality, nfactors = 3)
# Names of resulting objects -- look how many!
names(dig_qual_EFA)
EFA: What to look for
1. Factor loadings
Primary factor loading > |.5|
##
## Loadings:
## MR1 MR2 MR3
## easy_to_find 0.667
## quick_transaction 0.721
## quick_load 0.616
## easy_navigate 0.619
## post_purchase 0.207 0.521
## prompt_request 0.619
## quick_resolve 0.713
## safe_information 0.783
## no_sell_information 0.798
## safe_purchases 0.645 0.186
## good_hours -0.102
##
## MR1 MR2 MR3
## SS loadings 1.784 1.690 1.241
## Proportion Var 0.162 0.154 0.113
## Cumulative Var 0.162 0.316 0.429
2. Eigenvalues
Number of eigenvalues > 1 = number of factors (“Kaiser rule”)
## [1] 3.6866570 1.4604989 1.3048869 1.0227582 0.6078838 0.6008020 0.5802043
## [8] 0.5224980 0.4630395 0.4080585 0.3427130
2a. Scree plot
“Elbow rule:” pick the number of factors “before the dip in the elbow”
dig_qual_EFA$loadings
# We are right on the brink! What is 'contaminating' our model?
dig_qual_EFA$e.values
scree(digital_quality)
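Parallel analysis is a common complement to the Kaiser and elbow rules: psych's fa.parallel() compares the observed eigenvalues against those from random data and suggests a factor count. A quick sketch:
# Parallel analysis: retain factors whose eigenvalues exceed those of random data
fa.parallel(digital_quality)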
3. Factor score correlations
< .6 (related but not too related)
## [,1] [,2] [,3]
## [1,] 1.0000000 0.3938549 0.3404122
## [2,] 0.3938549 1.0000000 0.2665261
## [3,] 0.3404122 0.2665261 1.0000000
dig_qual_EFA$score.cor
Poor results… Now what?
Take another look at our item descriptions…
## var.name
## 1 easy_to_find
## 2 quick_transaction
## 3 quick_load
## 4 easy_navigate
## 5 post_purchase
## 6 prompt_request
## 7 quick_resolve
## 8 safe_information
## 9 no_sell_information
## 10 safe_purchases
## 11 good_hours
## Item
## 1 It is easy to find information on the website.
## 2 It is quick to make a transaction on the website.
## 3 The website loads quickly.
## 4 The website is easy to navigate.
## 5 The company provides useful post-purchase support.
## 6 I receive prompt responses to my requests.
## 7 The company quickly resolves any problems I encounter.
## 8 My information is safe.
## 9 The company won't sell my information.
## 10 I feel safe making purchases through this website.
## 11 The website provides good hours.
Hours for a website?
This is a confusing/misleading item. Drop it.
digital_quality_codebook
## $loadings
##
## Loadings:
## MR1 MR2 MR3
## easy_to_find 0.685
## quick_transaction 0.722
## quick_load 0.594
## easy_navigate 0.616
## post_purchase 0.193 0.537
## prompt_request 0.609
## quick_resolve 0.712
## safe_information 0.782
## no_sell_information 0.797
## safe_purchases 0.648 0.193
##
## MR1 MR2 MR3
## SS loadings 1.767 1.692 1.235
## Proportion Var 0.177 0.169 0.123
## Cumulative Var 0.177 0.346 0.469
##
## $e.values
## [1] 3.6866119 1.4603544 1.2763999 0.6408269 0.6018967 0.5885412 0.5226335
## [8] 0.4693499 0.4092372 0.3441483
##
## $score.cor
## [,1] [,2] [,3]
## [1,] 1.0000000 0.3938549 0.3887303
## [2,] 0.3938549 1.0000000 0.2875085
## [3,] 0.3887303 0.2875085 1.0000000
# Drop good_hours from the data set
library(dplyr)
digital_quality_10 <- dplyr::select(digital_quality, -good_hours)
digital_quality_10_EFA <- fa(digital_quality_10, nfactors = 3)
# Re-run our EFA diagnostics
digital_quality_10_EFA[c("loadings","e.values","score.cor")]
Coefficient/Cronbach’s alpha – do similar items in the survey produce similar scores?
##
## Reliability analysis
## Call: alpha(x = digital_quality_10)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.81 0.81 0.83 0.29 4.2 0.019 3.3 0.48 0.26
##
## lower alpha upper 95% confidence boundaries
## 0.77 0.81 0.84
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## easy_to_find 0.78 0.78 0.80 0.29 3.6 0.021
## quick_transaction 0.79 0.80 0.81 0.30 3.9 0.020
## quick_load 0.78 0.79 0.81 0.29 3.7 0.021
## easy_navigate 0.78 0.78 0.80 0.28 3.5 0.021
## post_purchase 0.79 0.79 0.81 0.30 3.8 0.020
## prompt_request 0.80 0.80 0.82 0.30 3.9 0.020
## quick_resolve 0.80 0.80 0.82 0.31 4.0 0.019
## safe_information 0.78 0.78 0.80 0.29 3.6 0.021
## no_sell_information 0.80 0.80 0.81 0.30 3.9 0.019
## safe_purchases 0.78 0.78 0.80 0.28 3.6 0.021
## var.r med.r
## easy_to_find 0.018 0.26
## quick_transaction 0.017 0.27
## quick_load 0.018 0.26
## easy_navigate 0.018 0.25
## post_purchase 0.019 0.26
## prompt_request 0.018 0.28
## quick_resolve 0.017 0.28
## safe_information 0.015 0.26
## no_sell_information 0.013 0.28
## safe_purchases 0.017 0.26
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## easy_to_find 250 0.66 0.65 0.61 0.54 3.5 0.85
## quick_transaction 250 0.55 0.56 0.49 0.43 3.5 0.77
## quick_load 250 0.63 0.63 0.57 0.51 3.6 0.80
## easy_navigate 250 0.69 0.69 0.65 0.59 3.5 0.78
## post_purchase 250 0.58 0.59 0.53 0.47 2.9 0.72
## prompt_request 250 0.53 0.54 0.46 0.41 3.4 0.73
## quick_resolve 250 0.52 0.52 0.44 0.37 3.0 0.88
## safe_information 250 0.66 0.65 0.63 0.55 3.2 0.78
## no_sell_information 250 0.55 0.54 0.49 0.41 3.6 0.85
## safe_purchases 250 0.67 0.66 0.62 0.55 3.2 0.85
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## easy_to_find 0.01 0.09 0.42 0.37 0.12 0
## quick_transaction 0.00 0.08 0.38 0.46 0.08 0
## quick_load 0.01 0.05 0.36 0.46 0.11 0
## easy_navigate 0.02 0.04 0.43 0.43 0.08 0
## post_purchase 0.02 0.21 0.59 0.16 0.02 0
## prompt_request 0.01 0.07 0.50 0.38 0.04 0
## quick_resolve 0.03 0.25 0.45 0.22 0.04 0
## safe_information 0.02 0.13 0.54 0.27 0.05 0
## no_sell_information 0.02 0.07 0.32 0.48 0.11 0
## safe_purchases 0.03 0.13 0.48 0.31 0.05 0
# Get Cronbach's alpha from psych
# look for std.alpha
alpha(digital_quality_10)
Cronbach’s alpha interpretation
Look for scores between .7 and .9.
Too high? Multicollinear/drop items…
Too low? Unreliable/drop items…
Sensitive to number of items
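Alpha is usually reported per subscale as well as for the full instrument; a sketch, assuming the “ease of use” item grouping suggested by the EFA loadings:
# Alpha for one presumed subscale (item grouping assumed from the EFA)
alpha(digital_quality_10[, c("easy_to_find", "quick_transaction",
                             "quick_load", "easy_navigate")])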
Split-half reliability: are items contributing equally to measurement?
## Split half reliabilities
## Call: splitHalf(r = digital_quality_10)
##
## Maximum split half reliability (lambda 4) = 0.87
## Guttman lambda 6 = 0.83
## Average split half reliability = 0.81
## Guttman lambda 3 (alpha) = 0.81
## Minimum split half reliability (beta) = 0.67
## Average interitem r = 0.29 with median = 0.26
Split-half reliability interpretation
Look for average > .8.
splitHalf(digital_quality_10)
Plot diagrams: “Every picture tells a story…”
fa.diagram(digital_quality_10_EFA)
Step 4: Confirmatory Factor Analysis
EFA versus CFA
EFA:
How many factors are needed to explain relationships in the data?
Theory development
CFA:
Does the data match our model?
Hypothesis testing (chi-square)
Introduction to lavaan
la(tent) va(riable) an(alysis)
Use =~ to assign items to factors
Fit the model to the data
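Before the real model, a minimal sketch of the syntax (the factor F and items x1–x3 here are hypothetical):
library(lavaan)
# One hypothetical latent factor measured by three hypothetical items
toy_model <- 'F =~ x1 + x2 + x3'
# fit <- cfa(model = toy_model, data = your_data)  # your_data is a placeholder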
Initial CFA
## lavaan 0.6-3 ended normally after 32 iterations
##
## Optimization method NLMINB
## Number of free parameters 23
##
## Number of observations 250
##
## Estimator ML
## Model Fit Test Statistic 46.018
## Degrees of freedom 32
## P-value (Chi-square) 0.052
##
## Model test baseline model:
##
## Minimum Function Test Statistic 730.192
## Degrees of freedom 45
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.980
## Tucker-Lewis Index (TLI) 0.971
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2639.166
## Loglikelihood unrestricted model (H1) -2616.157
##
## Number of free parameters 23
## Akaike (AIC) 5324.333
## Bayesian (BIC) 5405.326
## Sample-size adjusted Bayesian (BIC) 5332.414
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.042
## 90 Percent Confidence Interval 0.000 0.067
## P-value RMSEA <= 0.05 0.672
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.046
# Rename the variables so they're easier to work with
colnames(digital_quality_10) <- c(paste0("DQ", 1:10))
library(lavaan)
# Define the model
digital_quality_cfa_model <- 'F1 =~ DQ1 + DQ2 + DQ3 + DQ4
F2 =~ DQ5 + DQ6 + DQ7
F3 =~ DQ8 + DQ9 + DQ10'
# fit the model to the data
digital_quality_cfa <- cfa(data = digital_quality_10, model = digital_quality_cfa_model)
# Summarize with standardized loadings, fit measures
summary(digital_quality_cfa, standardized = TRUE, fit.measures = TRUE)
##
## Parameter Estimates:
##
## Information Expected
## Information saturated (h1) model Structured
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 =~
## DQ1 1.000 0.602 0.712
## DQ2 0.768 0.095 8.069 0.000 0.462 0.604
## DQ3 0.839 0.100 8.414 0.000 0.505 0.635
## DQ4 0.953 0.102 9.327 0.000 0.574 0.734
## F2 =~
## DQ5 1.000 0.486 0.673
## DQ6 0.946 0.139 6.783 0.000 0.460 0.628
## DQ7 1.130 0.167 6.787 0.000 0.550 0.629
## F3 =~
## DQ8 1.000 0.649 0.831
## DQ9 0.940 0.092 10.172 0.000 0.610 0.719
## DQ10 0.924 0.092 10.101 0.000 0.600 0.711
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 ~~
## F2 0.163 0.032 5.090 0.000 0.557 0.557
## F3 0.199 0.037 5.410 0.000 0.511 0.511
## F2 ~~
## F3 0.122 0.030 4.073 0.000 0.386 0.386
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .DQ1 0.352 0.043 8.138 0.000 0.352 0.493
## .DQ2 0.371 0.039 9.506 0.000 0.371 0.635
## .DQ3 0.377 0.041 9.203 0.000 0.377 0.597
## .DQ4 0.281 0.036 7.723 0.000 0.281 0.461
## .DQ5 0.286 0.039 7.256 0.000 0.286 0.548
## .DQ6 0.325 0.040 8.092 0.000 0.325 0.606
## .DQ7 0.462 0.057 8.075 0.000 0.462 0.605
## .DQ8 0.189 0.035 5.350 0.000 0.189 0.310
## .DQ9 0.349 0.042 8.239 0.000 0.349 0.484
## .DQ10 0.352 0.042 8.383 0.000 0.352 0.495
## F1 0.362 0.063 5.748 0.000 1.000 1.000
## F2 0.236 0.049 4.795 0.000 1.000 1.000
## F3 0.421 0.060 6.964 0.000 1.000 1.000
Hypothesis diverted
Other CFA fit measures (absolute)
Measure Value to look for
Comparative Fit Index (CFI) > .9
Tucker-Lewis Index (TLI) > .9
Root Mean Square Error of Approximation (RMSEA) < .05
Inspect alternative fit measures
## cfi tli rmsea
## 0.97954112 0.97122970 0.04186027
fitMeasures(digital_quality_cfa)[c("cfi","tli","rmsea")]
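fitMeasures() also accepts the desired measures as its second argument, which skips subsetting the full vector:
fitMeasures(digital_quality_cfa, c("cfi", "tli", "rmsea", "srmr"))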
New fit, new rules
Multivariate normality is assumed in theory…
## Call: mardia(x = digital_quality_10)
##
## Mardia tests of multivariate skew and kurtosis
## Use describe(x) the to get univariate tests
## n.obs = 250 num.vars = 10
## b1p = 10.76 skew = 448.43 with probability = 0
## small sample skew = 454.8 with probability = 0
## b2p = 142.38 kurtosis = 11.42 with probability = 0
Instead, use the mlr estimator for robust standard errors
mardia(digital_quality_10)
# fit the model to the data
digital_quality_cfa_mlr <- cfa(data = digital_quality_10, model = digital_quality_cfa_model, estimator = 'mlr')
# Summarize with standardized loadings, fit measures,
# MLR estimator
summary(digital_quality_cfa_mlr, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6-3 ended normally after 32 iterations
##
## Optimization method NLMINB
## Number of free parameters 23
##
## Number of observations 250
##
## Estimator ML Robust
## Model Fit Test Statistic 46.018 42.449
## Degrees of freedom 32 32
## P-value (Chi-square) 0.052 0.102
## Scaling correction factor 1.084
## for the Yuan-Bentler correction (Mplus variant)
##
## Model test baseline model:
##
## Minimum Function Test Statistic 730.192 594.765
## Degrees of freedom 45 45
## P-value 0.000 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.980 0.981
## Tucker-Lewis Index (TLI) 0.971 0.973
##
## Robust Comparative Fit Index (CFI) 0.983
## Robust Tucker-Lewis Index (TLI) 0.976
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2639.166 -2639.166
## Scaling correction factor 1.394
## for the MLR correction
## Loglikelihood unrestricted model (H1) -2616.157 -2616.157
## Scaling correction factor 1.214
## for the MLR correction
##
## Number of free parameters 23 23
## Akaike (AIC) 5324.333 5324.333
## Bayesian (BIC) 5405.326 5405.326
## Sample-size adjusted Bayesian (BIC) 5332.414 5332.414
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.042 0.036
## 90 Percent Confidence Interval 0.000 0.067 0.000 0.062
## P-value RMSEA <= 0.05 0.672 0.791
##
## Robust RMSEA 0.038
## 90 Percent Confidence Interval 0.000 0.065
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.046 0.046
##
## Parameter Estimates:
##
## Information Observed
## Observed information based on Hessian
## Standard Errors Robust.huber.white
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 =~
## DQ1 1.000 0.602 0.712
## DQ2 0.768 0.108 7.095 0.000 0.462 0.604
## DQ3 0.839 0.154 5.445 0.000 0.505 0.635
## DQ4 0.953 0.106 9.035 0.000 0.574 0.734
## F2 =~
## DQ5 1.000 0.486 0.673
## DQ6 0.946 0.202 4.683 0.000 0.460 0.628
## DQ7 1.130 0.218 5.194 0.000 0.550 0.629
## F3 =~
## DQ8 1.000 0.649 0.831
## DQ9 0.940 0.090 10.498 0.000 0.610 0.719
## DQ10 0.924 0.102 9.047 0.000 0.600 0.711
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 ~~
## F2 0.163 0.042 3.848 0.000 0.557 0.557
## F3 0.199 0.041 4.894 0.000 0.511 0.511
## F2 ~~
## F3 0.122 0.036 3.411 0.001 0.386 0.386
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .DQ1 0.352 0.043 8.198 0.000 0.352 0.493
## .DQ2 0.371 0.053 7.030 0.000 0.371 0.635
## .DQ3 0.377 0.073 5.171 0.000 0.377 0.597
## .DQ4 0.281 0.043 6.532 0.000 0.281 0.461
## .DQ5 0.286 0.050 5.754 0.000 0.286 0.548
## .DQ6 0.325 0.051 6.361 0.000 0.325 0.606
## .DQ7 0.462 0.071 6.542 0.000 0.462 0.605
## .DQ8 0.189 0.033 5.725 0.000 0.189 0.310
## .DQ9 0.349 0.049 7.075 0.000 0.349 0.484
## .DQ10 0.352 0.059 5.958 0.000 0.352 0.495
## F1 0.362 0.066 5.466 0.000 1.000 1.000
## F2 0.236 0.061 3.851 0.000 1.000 1.000
## F3 0.421 0.065 6.489 0.000 1.000 1.000
Standardized solution data frame
Good for slicing/dicing/reporting
## lhs op rhs est.std se z pvalue ci.lower ci.upper
## 1 F1 =~ DQ1 0.712 0.043 16.679 0 0.628 0.796
## 2 F1 =~ DQ2 0.604 0.050 12.191 0 0.507 0.701
## 3 F1 =~ DQ3 0.635 0.048 13.353 0 0.542 0.728
## 4 F1 =~ DQ4 0.734 0.041 17.751 0 0.653 0.815
## 5 F2 =~ DQ5 0.673 0.055 12.233 0 0.565 0.780
## 6 F2 =~ DQ6 0.628 0.056 11.167 0 0.518 0.738
## 7 F2 =~ DQ7 0.629 0.056 11.190 0 0.519 0.739
## 8 F3 =~ DQ8 0.831 0.036 22.851 0 0.759 0.902
## 9 F3 =~ DQ9 0.719 0.041 17.436 0 0.638 0.799
## 10 F3 =~ DQ10 0.711 0.042 17.083 0 0.629 0.793
# =~ are loadings
# ~~ are correlations
dig_qual_df <- standardizedsolution(digital_quality_cfa)
dig_qual_df %>%
filter(op == "=~") %>%
mutate_if(is.numeric, round, 3)
Extracting factor scores
lavPredict() (as in compute)…
Exploring our scores
## vars n mean sd median trimmed mad min max range skew kurtosis
## F1 1 250 0 0.54 0.03 0.00 0.50 -1.63 1.41 3.04 -0.07 0.43
## F2 2 250 0 0.41 0.00 0.00 0.38 -1.17 1.18 2.35 -0.06 -0.15
## F3 3 250 0 0.59 -0.06 0.01 0.51 -2.04 1.53 3.57 -0.16 0.77
## se
## F1 0.03
## F2 0.03
## F3 0.04
dig_qual_factor_scores <- as.data.frame(lavPredict(digital_quality_cfa))
describe(dig_qual_factor_scores)
multi.hist(dig_qual_factor_scores)
SEMghetti and meatballs
library(semPlot)
semPaths(digital_quality_cfa)
Construct validity: are we measuring what we are claiming to measure?
Convergent: dimensions are similar but not the same.
Average variance extracted (avevar) > .5
Discriminant: dimensions are distinct but not unrelated.
Composite reliability (omega) > .7
## F1 F2 F3 total
## alpha 0.7678713 0.6770723 0.7950026 0.8051184
## omega 0.7686797 0.6757982 0.7952188 0.8571486
## omega2 0.7686797 0.6757982 0.7952188 0.8571486
## omega3 0.7660973 0.6731814 0.7949393 0.8625192
## avevar 0.4563900 0.4113460 0.5644469 0.4780325
Strong model fit and low-ish reliability could mean measurement error.
library(semTools)
reliability(digital_quality_cfa)
Step 5-ish: Criterion validity
New variable: rewards_points, points each respondent accumulated in the six months after taking the survey.
Can our measurement model be used to predict this dependent variable?
i.e., does it possess predictive validity?
Standardized data
rewards_points is not on a scale of one to five.
Standardize all variables with scale().
## vars n mean sd median trimmed mad min max range skew
## DQ1 1 250 0 1 -0.60 0.01 1.75 -2.96 1.77 4.72 -0.05
## DQ2 2 250 0 1 0.61 0.02 1.93 -3.30 1.92 5.22 -0.21
## DQ3 3 250 0 1 0.50 -0.01 1.86 -3.27 1.75 5.02 -0.36
## DQ4 4 250 0 1 0.64 0.01 1.89 -3.19 1.92 5.11 -0.41
## DQ5 5 250 0 1 0.10 -0.01 0.00 -2.66 2.86 5.52 -0.02
## DQ6 6 250 0 1 -0.50 0.04 2.02 -3.23 2.22 5.45 -0.21
## DQ7 7 250 0 1 0.01 -0.01 1.69 -2.27 2.30 4.57 0.09
## DQ8 8 250 0 1 -0.26 0.01 0.00 -2.82 2.29 5.11 0.03
## DQ9 9 250 0 1 0.47 0.04 1.74 -3.05 1.65 4.70 -0.57
## DQ10 10 250 0 1 -0.27 0.03 1.75 -2.64 2.10 4.73 -0.21
## rewards_points 11 250 0 1 -0.01 -0.03 1.07 -3.25 2.87 6.11 0.18
## kurtosis se
## DQ1 -0.25 0.06
## DQ2 -0.08 0.06
## DQ3 0.40 0.06
## DQ4 0.84 0.06
## DQ5 0.59 0.06
## DQ6 0.48 0.06
## DQ7 -0.23 0.06
## DQ8 0.34 0.06
## DQ9 0.51 0.06
## DQ10 0.16 0.06
## rewards_points -0.13 0.06
# Standardize our data and get descriptives
digital_quality_scaled <- scale(digital_quality_rewards)
describe(digital_quality_scaled)
Regression in lavaan.
Just like base R – use ~.
## lavaan 0.6-3 ended normally after 28 iterations
##
## Optimization method NLMINB
## Number of free parameters 27
##
## Number of observations 250
##
## Estimator ML
## Model Fit Test Statistic 55.095
## Degrees of freedom 39
## P-value (Chi-square) 0.045
##
## Model test baseline model:
##
## Minimum Function Test Statistic 997.216
## Degrees of freedom 55
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.983
## Tucker-Lewis Index (TLI) 0.976
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3425.509
## Loglikelihood unrestricted model (H1) -3397.962
##
## Number of free parameters 27
## Akaike (AIC) 6905.018
## Bayesian (BIC) 7000.098
## Sample-size adjusted Bayesian (BIC) 6914.505
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.041
## 90 Percent Confidence Interval 0.006 0.064
## P-value RMSEA <= 0.05 0.721
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.044
##
## Parameter Estimates:
# Define the model
digital_quality_reg_model <- 'F1 =~ DQ1 + DQ2 + DQ3 + DQ4
F2 =~ DQ5 + DQ6 + DQ7
F3 =~ DQ8 + DQ9 + DQ10
rewards_points ~ F1 + F2 + F3'
# Fit the model to the data
digital_quality_reg <- sem(data = digital_quality_scaled,
model = digital_quality_reg_model)
# Include r-square in output
summary(digital_quality_reg, standardized = TRUE, fit.measures = TRUE, rsquare = TRUE)
##
## Information Expected
## Information saturated (h1) model Structured
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 =~
## DQ1 1.000 0.687 0.688
## DQ2 0.874 0.107 8.159 0.000 0.601 0.602
## DQ3 0.945 0.108 8.724 0.000 0.650 0.651
## DQ4 1.079 0.112 9.664 0.000 0.741 0.743
## F2 =~
## DQ5 1.000 0.679 0.680
## DQ6 0.923 0.121 7.608 0.000 0.626 0.628
## DQ7 0.910 0.121 7.538 0.000 0.618 0.619
## F3 =~
## DQ8 1.000 0.800 0.801
## DQ9 0.923 0.085 10.912 0.000 0.738 0.740
## DQ10 0.902 0.084 10.697 0.000 0.721 0.723
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## rewards_points ~
## F1 0.520 0.104 4.998 0.000 0.357 0.358
## F2 0.551 0.106 5.205 0.000 0.374 0.375
## F3 0.420 0.072 5.834 0.000 0.336 0.337
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## F1 ~~
## F2 0.261 0.050 5.187 0.000 0.559 0.559
## F3 0.282 0.053 5.364 0.000 0.514 0.514
## F2 ~~
## F3 0.210 0.051 4.108 0.000 0.387 0.387
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .DQ1 0.524 0.059 8.952 0.000 0.524 0.526
## .DQ2 0.635 0.065 9.790 0.000 0.635 0.638
## .DQ3 0.574 0.061 9.372 0.000 0.574 0.576
## .DQ4 0.446 0.055 8.116 0.000 0.446 0.448
## .DQ5 0.535 0.067 7.963 0.000 0.535 0.537
## .DQ6 0.604 0.069 8.783 0.000 0.604 0.606
## .DQ7 0.614 0.069 8.891 0.000 0.614 0.617
## .DQ8 0.357 0.052 6.865 0.000 0.357 0.358
## .DQ9 0.451 0.055 8.255 0.000 0.451 0.453
## .DQ10 0.476 0.056 8.554 0.000 0.476 0.478
## .rewards_points 0.245 0.034 7.215 0.000 0.245 0.246
## F1 0.472 0.083 5.654 0.000 1.000 1.000
## F2 0.461 0.089 5.196 0.000 1.000 1.000
## F3 0.639 0.093 6.894 0.000 1.000 1.000
##
## R-Square:
## Estimate
## DQ1 0.474
## DQ2 0.362
## DQ3 0.424
## DQ4 0.552
## DQ5 0.463
## DQ6 0.394
## DQ7 0.383
## DQ8 0.642
## DQ9 0.547
## DQ10 0.522
## rewards_points 0.754
Survey and Measure Development in R
Plotting our model
rotation = 2 makes diagram “read” from left to right.
whatLabels = std includes standardized coefficients.
# rotation = 2 makes diagram "read" from left to right.
# whatLabels = std includes standardized coefficients.
semPaths(digital_quality_reg, rotation = 2, whatLabels = "std")
Step 6: Replication
Does the test hold up in the same population over time (test/retest)?
Does the test hold up across distinct populations (invariance)? See the sketch below.
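Invariance testing isn't demonstrated in this deck; here is a minimal lavaan sketch, assuming a hypothetical grouping column group_var in the data, that compares a configural model to one with loadings constrained equal across groups:
# Hypothetical invariance check: group_var is a placeholder grouping column
configural <- cfa(digital_quality_cfa_model, data = digital_quality_10,
                  group = "group_var")
metric     <- cfa(digital_quality_cfa_model, data = digital_quality_10,
                  group = "group_var", group.equal = "loadings")
anova(configural, metric)  # chi-square difference test of metric invariance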
Test-retest: are scores of the same respondent the same at T1 and T2?
##
## Descriptive statistics by group
## group: 1
## vars n mean sd median trimmed mad min max range skew kurtosis
## id 1 70 35.50 20.35 35.5 35.50 25.95 1 70 69 0.00 -1.25
## time 2 70 1.00 0.00 1.0 1.00 0.00 1 1 0 NaN NaN
## DQ1 3 70 1.47 0.91 1.0 1.25 0.00 1 5 4 2.23 4.87
## DQ2 4 70 1.41 0.84 1.0 1.20 0.00 1 5 4 2.27 5.02
## DQ3 5 70 1.50 1.00 1.0 1.25 0.00 1 5 4 2.03 3.29
## DQ4 6 70 2.21 1.19 2.0 2.07 1.48 1 5 4 0.71 -0.42
## DQ5 7 70 3.27 1.45 3.5 3.34 2.22 1 5 4 -0.28 -1.33
## DQ6 8 70 3.06 1.41 3.0 3.07 1.48 1 5 4 -0.04 -1.30
## DQ7 9 70 1.91 1.10 2.0 1.73 1.48 1 5 4 1.13 0.60
## DQ8 10 70 2.23 0.89 2.0 2.16 0.00 1 5 4 0.90 1.16
## DQ9 11 70 2.50 1.05 2.5 2.45 0.74 1 5 4 0.34 -0.36
## DQ10 12 70 2.59 1.03 3.0 2.54 1.48 1 5 4 0.32 -0.08
## se
## id 2.43
## time 0.00
## DQ1 0.11
## DQ2 0.10
## DQ3 0.12
## DQ4 0.14
## DQ5 0.17
## DQ6 0.17
## DQ7 0.13
## DQ8 0.11
## DQ9 0.13
## DQ10 0.12
## --------------------------------------------------------
## group: 2
## vars n mean sd median trimmed mad min max range skew kurtosis
## id 1 70 35.50 20.35 35.5 35.50 25.95 1 70 69 0.00 -1.25
## time 2 70 2.00 0.00 2.0 2.00 0.00 2 2 0 NaN NaN
## DQ1 3 70 1.61 0.94 1.0 1.43 0.00 1 5 4 1.76 3.07
## DQ2 4 70 1.59 0.96 1.0 1.39 0.00 1 5 4 1.77 2.89
## DQ3 5 70 1.80 1.11 1.0 1.59 0.00 1 5 4 1.45 1.49
## DQ4 6 70 2.13 1.13 2.0 1.98 1.48 1 5 4 0.83 -0.13
## DQ5 7 70 3.27 1.42 3.0 3.34 1.48 1 5 4 -0.30 -1.15
## DQ6 8 70 3.04 1.44 3.0 3.05 1.48 1 5 4 -0.07 -1.32
## DQ7 9 70 2.20 1.17 2.0 2.05 1.48 1 5 4 0.73 -0.32
## DQ8 10 70 2.44 0.86 2.0 2.41 1.48 1 5 4 0.64 0.70
## DQ9 11 70 2.54 0.96 3.0 2.52 1.48 1 5 4 0.22 -0.22
## DQ10 12 70 2.67 0.94 3.0 2.68 1.48 1 5 4 -0.03 -0.16
## se
## id 2.43
## time 0.00
## DQ1 0.11
## DQ2 0.11
## DQ3 0.13
## DQ4 0.13
## DQ5 0.17
## DQ6 0.17
## DQ7 0.14
## DQ8 0.10
## DQ9 0.11
## DQ10 0.11
describeBy(digital_quality_t1_t2, group = "time")
r12: correlation of scaled scores across T1 and T2
Look for scores > .7
## [1] 0.9337308
digital_quality_test_retest <- testRetest(t1 = filter(digital_quality_t1_t2, time == 1),
t2 = filter(digital_quality_t1_t2, time == 2),
id = "id")
digital_quality_test_retest$r12
Recap
Resources
For the theory: DeVellis’s Scale development: Theory and applications.
For the R: Dr. Wan Arifin’s page: https://wnarifin.github.io/index.html.
For the marketing applications: Chapman & Feit’s R for Marketing Research and
Analytics.
For the practice: DataCamp’s “Survey and Measurement Development in R”
(by George!)
Questions?
Thank you
georgejmount.com
linkedin.com/in/gjmount
github.com/summerofgeorge
twitter.com/gjmount
More Related Content

PDF
Text Analytics
Nicolas Morales
 
PDF
Data warehouse architecture
pcherukumalla
 
PDF
Top 8 Data Science Tools | Open Source Tools for Data Scientists | Edureka
Edureka!
 
PPTX
DBMS - FIRST NORMAL FORM
MANISH T I
 
PDF
Introduction to PySpark
Russell Jurney
 
DOC
Difference between ER-Modeling and Dimensional Modeling
Abdul Aslam
 
PPTX
Exception handling.pptx
NISHASOMSCS113
 
PDF
Ms access
RoshanMaharjan13
 
Text Analytics
Nicolas Morales
 
Data warehouse architecture
pcherukumalla
 
Top 8 Data Science Tools | Open Source Tools for Data Scientists | Edureka
Edureka!
 
DBMS - FIRST NORMAL FORM
MANISH T I
 
Introduction to PySpark
Russell Jurney
 
Difference between ER-Modeling and Dimensional Modeling
Abdul Aslam
 
Exception handling.pptx
NISHASOMSCS113
 
Ms access
RoshanMaharjan13
 

Similar to Survey and Measure Development in R (20)

PDF
Kenett On Information NYU-Poly 2013
The Hebrew University of Jerusalem
 
PPTX
Ali upload
Ali Zahraei, Ph.D
 
PPTX
2019 WIA - Data-Driven Product Improvements
Women in Analytics Conference
 
PDF
Exploratory Data Analysis - Satyajit.pdf
AmmarAhmedSiddiqui2
 
PDF
Maximizing Your ML Success with Innovative Feature Engineering
FeatureByte
 
PPTX
Metabolomic Data Analysis Workshop and Tutorials (2014)
Dmitry Grapov
 
PDF
How Data Scientists Make Reliable Decisions with Data
Ta-Wei (David) Huang
 
DOCX
Data quality management tools
selinasimpson1601
 
PPTX
Basic Analysis using R
Sankhya_Analytics
 
PPTX
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Serban Tanasa
 
PPTX
EDA_Revision_Session_1cdeba87-6912-4236-ba3b-079a5463bf00.pptx
saurav3107pandey
 
PPTX
Monitoring Processes
switchsolutions
 
DOCX
TOPIC Bench-marking Testing1. Windows operating system (Microso.docx
juliennehar
 
PPT
Analytical Design in Applied Marketing Research
Kelly Page
 
PDF
Kenett on info q and pse
The Hebrew University of Jerusalem
 
PDF
Functional Data Analysis Ecommerce
Andrés Acosta Escobar
 
DOCX
Quality management office
selinasimpson1701
 
PDF
Recommender Systems: Beyond the user-item matrix
Eugene Yan Ziyou
 
PPTX
Apriori.pptx
Ramakrishna Reddy Bijjam
 
PDF
Business and Data Analytics Collaborative April Meetup
Ken Tucker
 
Kenett On Information NYU-Poly 2013
The Hebrew University of Jerusalem
 
Ali upload
Ali Zahraei, Ph.D
 
2019 WIA - Data-Driven Product Improvements
Women in Analytics Conference
 
Exploratory Data Analysis - Satyajit.pdf
AmmarAhmedSiddiqui2
 
Maximizing Your ML Success with Innovative Feature Engineering
FeatureByte
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Dmitry Grapov
 
How Data Scientists Make Reliable Decisions with Data
Ta-Wei (David) Huang
 
Data quality management tools
selinasimpson1601
 
Basic Analysis using R
Sankhya_Analytics
 
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Serban Tanasa
 
EDA_Revision_Session_1cdeba87-6912-4236-ba3b-079a5463bf00.pptx
saurav3107pandey
 
Monitoring Processes
switchsolutions
 
TOPIC Bench-marking Testing1. Windows operating system (Microso.docx
juliennehar
 
Analytical Design in Applied Marketing Research
Kelly Page
 
Kenett on info q and pse
The Hebrew University of Jerusalem
 
Functional Data Analysis Ecommerce
Andrés Acosta Escobar
 
Quality management office
selinasimpson1701
 
Recommender Systems: Beyond the user-item matrix
Eugene Yan Ziyou
 
Business and Data Analytics Collaborative April Meetup
Ken Tucker
 
Ad

More from George Mount (20)

PPTX
Building a Data Academy: Presentation to Pittsburgh Chapter, Association for ...
George Mount
 
PPTX
Building the Data Academy (Pluralsight LIVE presentation)
George Mount
 
PDF
Blogging Effectively about Coding (WordCamp Denver 2020)
George Mount
 
PDF
Demo guide: The central limit theorem, visualized in Excel
George Mount
 
PDF
What is the data analytics stack?
George Mount
 
PDF
Blended learning for data education
George Mount
 
PDF
Teaching coding: What is pair programming?
George Mount
 
PDF
Hiring data scientists doesn't make a data culture
George Mount
 
PDF
It’s time to open-source your data tools and processes
George Mount
 
PDF
Building the data academy: Measuring ROI
George Mount
 
PDF
Building your data academy: Assessing candidate skills
George Mount
 
PDF
Teaching coding: What is a faded example?
George Mount
 
PPTX
YouTube is not a data upskilling strategy
George Mount
 
PPTX
Five myths about learning data analytics
George Mount
 
PDF
Building your data academy
George Mount
 
PDF
It's not what you know, it's how you show.
George Mount
 
PDF
Mediation in R's lavaan package
George Mount
 
PPTX
Qualitative Research Study Exercise: Reading Choices and Habits
George Mount
 
PPTX
Hey, Analyst! Learn Some Content Marketing
George Mount
 
PPTX
Why Be a Social Analyst?
George Mount
 
Building a Data Academy: Presentation to Pittsburgh Chapter, Association for ...
George Mount
 
Building the Data Academy (Pluralsight LIVE presentation)
George Mount
 
Blogging Effectively about Coding (WordCamp Denver 2020)
George Mount
 
Demo guide: The central limit theorem, visualized in Excel
George Mount
 
What is the data analytics stack?
George Mount
 
Blended learning for data education
George Mount
 
Teaching coding: What is pair programming?
George Mount
 
Hiring data scientists doesn't make a data culture
George Mount
 
It’s time to open-source your data tools and processes
George Mount
 
Building the data academy: Measuring ROI
George Mount
 
Building your data academy: Assessing candidate skills
George Mount
 
Teaching coding: What is a faded example?
George Mount
 
YouTube is not a data upskilling strategy
George Mount
 
Five myths about learning data analytics
George Mount
 
Building your data academy
George Mount
 
It's not what you know, it's how you show.
George Mount
 
Mediation in R's lavaan package
George Mount
 
Qualitative Research Study Exercise: Reading Choices and Habits
George Mount
 
Hey, Analyst! Learn Some Content Marketing
George Mount
 
Why Be a Social Analyst?
George Mount
 
Ad

Recently uploaded (20)

PPTX
Blue and Dark Blue Modern Technology Presentation.pptx
ap177979
 
PDF
202501214233242351219 QASS Session 2.pdf
lauramejiamillan
 
PDF
Blitz Campinas - Dia 24 de maio - Piettro.pdf
fabigreek
 
PPTX
HSE WEEKLY REPORT for dummies and lazzzzy.pptx
ahmedibrahim691723
 
PPTX
Data-Driven Machine Learning for Rail Infrastructure Health Monitoring
Sione Palu
 
PDF
TIC ACTIVIDAD 1geeeeeeeeeeeeeeeeeeeeeeeeeeeeeer3.pdf
Thais Ruiz
 
PDF
202501214233242351219 QASS Session 2.pdf
lauramejiamillan
 
PPTX
Introduction-to-Python-Programming-Language (1).pptx
dhyeysapariya
 
PPTX
Presentation on animal welfare a good topic
kidscream385
 
PPTX
Databricks-DE-Associate Certification Questions-june-2024.pptx
pedelli41
 
PDF
Blue Futuristic Cyber Security Presentation.pdf
tanvikhunt1003
 
PPTX
MR and reffffffvvvvvvvfversal_083605.pptx
manjeshjain
 
PPTX
Fluvial_Civilizations_Presentation (1).pptx
alisslovemendoza7
 
PPTX
M1-T1.pptxM1-T1.pptxM1-T1.pptxM1-T1.pptx
teodoroferiarevanojr
 
PPTX
Introduction to computer chapter one 2017.pptx
mensunmarley
 
PDF
Classifcation using Machine Learning and deep learning
bhaveshagrawal35
 
PPT
Real Life Application of Set theory, Relations and Functions
manavparmar205
 
PPTX
short term internship project on Data visualization
JMJCollegeComputerde
 
PPTX
Data-Users-in-Database-Management-Systems (1).pptx
dharmik832021
 
PDF
Fundamentals and Techniques of Biophysics and Molecular Biology (Pranav Kumar...
RohitKumar868624
 
Blue and Dark Blue Modern Technology Presentation.pptx
ap177979
 
202501214233242351219 QASS Session 2.pdf
lauramejiamillan
 
Blitz Campinas - Dia 24 de maio - Piettro.pdf
fabigreek
 
HSE WEEKLY REPORT for dummies and lazzzzy.pptx
ahmedibrahim691723
 
Data-Driven Machine Learning for Rail Infrastructure Health Monitoring
Sione Palu
 
TIC ACTIVIDAD 1geeeeeeeeeeeeeeeeeeeeeeeeeeeeeer3.pdf
Thais Ruiz
 
202501214233242351219 QASS Session 2.pdf
lauramejiamillan
 
Introduction-to-Python-Programming-Language (1).pptx
dhyeysapariya
 
Presentation on animal welfare a good topic
kidscream385
 
Databricks-DE-Associate Certification Questions-june-2024.pptx
pedelli41
 
Blue Futuristic Cyber Security Presentation.pdf
tanvikhunt1003
 
MR and reffffffvvvvvvvfversal_083605.pptx
manjeshjain
 
Fluvial_Civilizations_Presentation (1).pptx
alisslovemendoza7
 
M1-T1.pptxM1-T1.pptxM1-T1.pptxM1-T1.pptx
teodoroferiarevanojr
 
Introduction to computer chapter one 2017.pptx
mensunmarley
 
Classifcation using Machine Learning and deep learning
bhaveshagrawal35
 
Real Life Application of Set theory, Relations and Functions
manavparmar205
 
short term internship project on Data visualization
JMJCollegeComputerde
 
Data-Users-in-Database-Management-Systems (1).pptx
dharmik832021
 
Fundamentals and Techniques of Biophysics and Molecular Biology (Pranav Kumar...
RohitKumar868624
 

Survey and Measure Development in R

  • 1. Survey and Measure Development in R George Mount July 31, 2019
  • 2. Presentation agenda The “why’s” of survey development & measurement Applications in R Resources
  • 4. Anatomy of a survey
  • 5. Hitting our mark: reliability and validity There’s more than one way to validate an instrument!
  • 7. Finally some code! packages_used <- c("likert","irr", "lavaan","semTools","psych","semPlot","corrplot", "tidyverse","GPArotation") lapply(packages_used, require, character.only = TRUE)
  • 8. Step 1: Item generation Content validity: how essential is each item to the concept? Inter-rater reliability: do the experts agree this is a good item? Inter-rater reliability: Base agreement ## Rater_A Rater_B ## 1 3 3 ## 2 1 1 ## 3 2 3 ## 4 2 2 ## 5 1 1 ## 6 3 2 ## Percentage agreement (Tolerance=0) ## ## Subjects = 13 ## Raters = 2 ## %-agree = 61.5 Account for agreement due to chance: Cohen’s kappa ## Call: cohen.kappa1(x = x, w = w, n.obs = n.obs, alpha = alpha, levels = levels) ## ## Cohen Kappa and Weighted Kappa correlation coefficients and confidence boundaries ## lower estimate upper ## unweighted kappa -0.0056 0.41 0.82 ## weighted kappa -0.3086 0.32 0.94 ## ## Number of subjects = 13 Weighted kappa interpretation Weighted kappa interpretation 0 - .39 Poor # Two experts rate their assessment of items on a scale of 1 to 3 head(experts) library(irr) agree(experts) library(psych) cohen.kappa(experts)
  • 9. Weighted kappa interpretation .4 - .59 Moderate .6 - .79 Substantial .8 - 1 Very strong
  • 10. Step 2: Questionnaire administration Sampling (non-)Response bias Other important things not done in R…
  • 11. Step 2.5: Exploratory data analysis Summary statistics ## vars n mean sd median trimmed mad min max range ## easy_to_find 1 250 3.50 0.85 3 3.51 1.48 1 5 4 ## quick_transaction 2 250 3.53 0.77 4 3.54 1.48 1 5 4 ## quick_load 3 250 3.60 0.80 4 3.60 1.48 1 5 4 ## easy_navigate 4 250 3.50 0.78 4 3.50 1.48 1 5 4 ## post_purchase 5 250 2.93 0.72 3 2.92 0.00 1 5 4 ## prompt_request 6 250 3.37 0.73 3 3.40 1.48 1 5 4 ## quick_resolve 7 250 2.99 0.88 3 2.98 1.48 1 5 4 ## safe_information 8 250 3.20 0.78 3 3.21 0.00 1 5 4 ## no_sell_information 9 250 3.60 0.85 4 3.63 1.48 1 5 4 ## safe_purchases 10 250 3.23 0.85 3 3.25 1.48 1 5 4 ## good_hours 11 250 2.89 1.36 3 2.86 1.48 1 5 4 ## skew kurtosis se ## easy_to_find -0.05 -0.25 0.05 ## quick_transaction -0.21 -0.08 0.05 ## quick_load -0.36 0.40 0.05 ## easy_navigate -0.41 0.84 0.05 ## post_purchase -0.02 0.59 0.05 ## prompt_request -0.21 0.48 0.05 ## quick_resolve 0.09 -0.23 0.06 ## safe_information 0.03 0.34 0.05 ## no_sell_information -0.57 0.51 0.05 ## safe_purchases -0.21 0.16 0.05 ## good_hours 0.11 -1.22 0.09 Response frequencies ## 1 2 3 4 5 miss ## easy_to_find 0.8 8.8 41.6 36.8 12.0 0 ## quick_transaction 0.4 7.6 38.4 45.6 8.0 0 ## quick_load 1.2 5.2 36.4 46.4 10.8 0 ## easy_navigate 2.0 4.4 43.2 42.8 7.6 0 ## post_purchase 2.4 21.2 59.2 15.6 1.6 0 ## prompt_request 1.2 7.2 49.6 37.6 4.4 0 ## quick_resolve 3.2 25.2 45.2 22.4 4.0 0 ## safe_information 1.6 12.8 54.0 26.8 4.8 0 ## no_sell_information 2.0 6.8 32.0 48.0 11.2 0 ## safe_purchases 2.8 13.2 47.6 31.2 5.2 0 ## good_hours 19.6 23.6 20.8 20.4 15.6 0 Response frequencies – visualization library(psych) describe(digital_quality) # Load our BFF, the psych package library(psych) round(response.frequencies(digital_quality),4) * 100
  • 12. ## [1] "It is easy to find information on the website." ## [2] "It is quick to make a transaction on the website." ## [3] "The website loads quickly." ## [4] "The website is easy to navigate." ## [5] "The company provides useful post-purchase support." ## [6] "I receive prompt responses to my requests." ## [7] "The company quickly resolves any problems I enounter." ## [8] "My information is safe." ## [9] "The company won't sell my information." ## [10] "I feel safe making purchases through this website." ## [11] "The website provides good hours. " Correlations # select likert items, convert to factors library(likert) # Convert items to factors dig_qual_f <- dig_qual_labelled %>% mutate_if(is.integer, as.factor) names(dig_qual_f) <- c(digital_quality_codebook$Item) result <- likert(dig_qual_f) plot(result, group.order = names(dig_qual_f)) digital_quality_codebook$Item lowerCor(digital_quality)
  • 13. ## esy__ qck_t qck_l esy_n pst_p prmp_ qck_r sf_nf n_sl_ ## easy_to_find 1.00 ## quick_transaction 0.46 1.00 ## quick_load 0.43 0.42 1.00 ## easy_navigate 0.52 0.42 0.45 1.00 ## post_purchase 0.29 0.26 0.29 0.33 1.00 ## prompt_request 0.24 0.13 0.20 0.26 0.40 1.00 ## quick_resolve 0.25 0.13 0.14 0.26 0.42 0.43 1.00 ## safe_information 0.26 0.25 0.30 0.36 0.22 0.23 0.14 1.00 ## no_sell_information 0.21 0.15 0.21 0.24 0.09 0.13 0.09 0.61 1.00 ## safe_purchases 0.27 0.15 0.34 0.32 0.28 0.25 0.26 0.57 0.51 ## good_hours -0.07 0.03 0.11 0.02 0.03 -0.08 -0.05 -0.03 -0.01 ## sf_pr ## easy_to_find ## quick_transaction ## quick_load ## easy_navigate ## post_purchase ## prompt_request ## quick_resolve ## safe_information ## no_sell_information ## safe_purchases 1.00 ## good_hours 0.00 ## [1] 1.00 From correlations to factors library(corrplot) corrplot(cor(digital_quality))
  • 15. Step 3: Initial item reduction How many factors/dimensions are “reflected” in the data? Do any items not reflect a factor?
  • 16. Exploratory factor analysis (EFA) in psych ## [1] "residual" "dof" "chi" "nh" ## [5] "rms" "EPVAL" "crms" "EBIC" ## [9] "ESABIC" "fit" "fit.off" "sd" ## [13] "factors" "complexity" "n.obs" "objective" ## [17] "criteria" "STATISTIC" "PVAL" "Call" ## [21] "null.model" "null.dof" "null.chisq" "TLI" ## [25] "RMSEA" "BIC" "SABIC" "r.scores" ## [29] "R2" "valid" "score.cor" "weights" ## [33] "rotation" "communality" "communalities" "uniquenesses" ## [37] "values" "e.values" "loadings" "model" ## [41] "fm" "rot.mat" "Phi" "Structure" ## [45] "method" "scores" "R2.scores" "r" ## [49] "np.obs" "fn" "Vaccounted" # factor analysis with presumed number of factors dig_qual_EFA <- fa(digital_quality, nfactors = 3) # Names of resulting objects -- look how many! names(dig_qual_EFA)
  • 17. EFA: What to look for 1. Factor loadings Primary factor loading > |.5| ## ## Loadings: ## MR1 MR2 MR3 ## easy_to_find 0.667 ## quick_transaction 0.721 ## quick_load 0.616 ## easy_navigate 0.619 ## post_purchase 0.207 0.521 ## prompt_request 0.619 ## quick_resolve 0.713 ## safe_information 0.783 ## no_sell_information 0.798 ## safe_purchases 0.645 0.186 ## good_hours -0.102 ## ## MR1 MR2 MR3 ## SS loadings 1.784 1.690 1.241 ## Proportion Var 0.162 0.154 0.113 ## Cumulative Var 0.162 0.316 0.429 2. Eigenvalues Number of eigenvalues > 1 = number of factors (“Kaiser rule”) ## [1] 3.6866570 1.4604989 1.3048869 1.0227582 0.6078838 0.6008020 0.5802043 ## [8] 0.5224980 0.4630395 0.4080585 0.3427130 2a. Scree plot “Elbow rule:” pick the number of factors “before the dip in the elbow” dig_qual_EFA$loadings # We are right on the brink! What is 'contaminating' our model? dig_qual_EFA$e.values scree(digital_quality_10)
  • 18. 3. Factor score correlations < .6 (related but not too related) ## [,1] [,2] [,3] ## [1,] 1.0000000 0.3938549 0.3404122 ## [2,] 0.3938549 1.0000000 0.2665261 ## [3,] 0.3404122 0.2665261 1.0000000 dig_qual_EFA$score.cor
  • 19. Poor results… Now what? Take another look at our item descriptions… ## var.name ## 1 easy_to_find ## 2 quick_transaction ## 3 quick_load ## 4 easy_navigate ## 5 post_purchase ## 6 prompt_request ## 7 quick_resolve ## 8 safe_information ## 9 no_sell_information ## 10 safe_purchases ## 11 good_hours ## Item ## 1 It is easy to find information on the website. ## 2 It is quick to make a transaction on the website. ## 3 The website loads quickly. ## 4 The website is easy to navigate. ## 5 The company provides useful post-purchase support. ## 6 I receive prompt responses to my requests. ## 7 The company quickly resolves any problems I enounter. ## 8 My information is safe. ## 9 The company won't sell my information. ## 10 I feel safe making purchases through this website. ## 11 The website provides good hours. Hours for a website? This is a confusing/misleading item. Drop it. digital_quality_codebook
  • 20. ## $loadings ## ## Loadings: ## MR1 MR2 MR3 ## easy_to_find 0.685 ## quick_transaction 0.722 ## quick_load 0.594 ## easy_navigate 0.616 ## post_purchase 0.193 0.537 ## prompt_request 0.609 ## quick_resolve 0.712 ## safe_information 0.782 ## no_sell_information 0.797 ## safe_purchases 0.648 0.193 ## ## MR1 MR2 MR3 ## SS loadings 1.767 1.692 1.235 ## Proportion Var 0.177 0.169 0.123 ## Cumulative Var 0.177 0.346 0.469 ## ## $e.values ## [1] 3.6866119 1.4603544 1.2763999 0.6408269 0.6018967 0.5885412 0.5226335 ## [8] 0.4693499 0.4092372 0.3441483 ## ## $score.cor ## [,1] [,2] [,3] ## [1,] 1.0000000 0.3938549 0.3887303 ## [2,] 0.3938549 1.0000000 0.2875085 ## [3,] 0.3887303 0.2875085 1.0000000 # Drop good_hours from the data set library(dplyr) digital_quality_10 <- dplyr::select(digital_quality, -good_hours) digital_quality_10_EFA <- fa(digital_quality_10, nfactors = 3) # Re-run our EFA diagnostics digital_quality_10_EFA[c("loadings","e.values","score.cor")]
  • 21. Coefficient/Cronbach’s alpha – do similar items in the survey produce similar scores? ## ## Reliability analysis ## Call: alpha(x = digital_quality_10) ## ## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r ## 0.81 0.81 0.83 0.29 4.2 0.019 3.3 0.48 0.26 ## ## lower alpha upper 95% confidence boundaries ## 0.77 0.81 0.84 ## ## Reliability if an item is dropped: ## raw_alpha std.alpha G6(smc) average_r S/N alpha se ## easy_to_find 0.78 0.78 0.80 0.29 3.6 0.021 ## quick_transaction 0.79 0.80 0.81 0.30 3.9 0.020 ## quick_load 0.78 0.79 0.81 0.29 3.7 0.021 ## easy_navigate 0.78 0.78 0.80 0.28 3.5 0.021 ## post_purchase 0.79 0.79 0.81 0.30 3.8 0.020 ## prompt_request 0.80 0.80 0.82 0.30 3.9 0.020 ## quick_resolve 0.80 0.80 0.82 0.31 4.0 0.019 ## safe_information 0.78 0.78 0.80 0.29 3.6 0.021 ## no_sell_information 0.80 0.80 0.81 0.30 3.9 0.019 ## safe_purchases 0.78 0.78 0.80 0.28 3.6 0.021 ## var.r med.r ## easy_to_find 0.018 0.26 ## quick_transaction 0.017 0.27 ## quick_load 0.018 0.26 ## easy_navigate 0.018 0.25 ## post_purchase 0.019 0.26 ## prompt_request 0.018 0.28 ## quick_resolve 0.017 0.28 ## safe_information 0.015 0.26 ## no_sell_information 0.013 0.28 ## safe_purchases 0.017 0.26 ## ## Item statistics ## n raw.r std.r r.cor r.drop mean sd ## easy_to_find 250 0.66 0.65 0.61 0.54 3.5 0.85 ## quick_transaction 250 0.55 0.56 0.49 0.43 3.5 0.77 ## quick_load 250 0.63 0.63 0.57 0.51 3.6 0.80 ## easy_navigate 250 0.69 0.69 0.65 0.59 3.5 0.78 ## post_purchase 250 0.58 0.59 0.53 0.47 2.9 0.72 ## prompt_request 250 0.53 0.54 0.46 0.41 3.4 0.73 ## quick_resolve 250 0.52 0.52 0.44 0.37 3.0 0.88 ## safe_information 250 0.66 0.65 0.63 0.55 3.2 0.78 ## no_sell_information 250 0.55 0.54 0.49 0.41 3.6 0.85 ## safe_purchases 250 0.67 0.66 0.62 0.55 3.2 0.85 ## ## Non missing response frequency for each item ## 1 2 3 4 5 miss ## easy_to_find 0.01 0.09 0.42 0.37 0.12 0 ## quick_transaction 0.00 0.08 0.38 0.46 0.08 0 # Get Cronbach's alpha from psych # look for std.alpha alpha(digital_quality_10)
  • 22. ## quick_load 0.01 0.05 0.36 0.46 0.11 0 ## easy_navigate 0.02 0.04 0.43 0.43 0.08 0 ## post_purchase 0.02 0.21 0.59 0.16 0.02 0 ## prompt_request 0.01 0.07 0.50 0.38 0.04 0 ## quick_resolve 0.03 0.25 0.45 0.22 0.04 0 ## safe_information 0.02 0.13 0.54 0.27 0.05 0 ## no_sell_information 0.02 0.07 0.32 0.48 0.11 0 ## safe_purchases 0.03 0.13 0.48 0.31 0.05 0 Cronbach’s alpha interpretation Look for scores between .7 and .9. Too high? Multicollinear/drop items… Too low? Unreliable/drop iems… Sensitive to number of items
  • 23. Split-half reliability: are items contributing equally to measurement? ## Split half reliabilities ## Call: splitHalf(r = digital_quality_10) ## ## Maximum split half reliability (lambda 4) = 0.87 ## Guttman lambda 6 = 0.83 ## Average split half reliability = 0.81 ## Guttman lambda 3 (alpha) = 0.81 ## Minimum split half reliability (beta) = 0.67 ## Average interitem r = 0.29 with median = 0.26 Split-half reliability interpretation Look for average > .8. splitHalf(digital_quality_10)
  • 24. Plot diagrams: “Every picture tells a story…” fa.diagram(digital_quality_10_EFA)
  • 25. Step 4: Confirmatory Factor Analysis
  • 26. EFA versus CFA EFA: How many factors are needed to explain relationships in the data? Theory development CFA: Does the data match our model? Hypothesis testing (chi-square)
  • 27. Introduction to lavaan la(tent) va(riable) an(alysis) Use =~ to assign items to factors Fit the model to the data
  • 28. Initial CFA ## lavaan 0.6-3 ended normally after 32 iterations ## ## Optimization method NLMINB ## Number of free parameters 23 ## ## Number of observations 250 ## ## Estimator ML ## Model Fit Test Statistic 46.018 ## Degrees of freedom 32 ## P-value (Chi-square) 0.052 ## ## Model test baseline model: ## ## Minimum Function Test Statistic 730.192 ## Degrees of freedom 45 ## P-value 0.000 ## ## User model versus baseline model: ## ## Comparative Fit Index (CFI) 0.980 ## Tucker-Lewis Index (TLI) 0.971 ## ## Loglikelihood and Information Criteria: ## ## Loglikelihood user model (H0) -2639.166 ## Loglikelihood unrestricted model (H1) -2616.157 ## ## Number of free parameters 23 ## Akaike (AIC) 5324.333 ## Bayesian (BIC) 5405.326 ## Sample-size adjusted Bayesian (BIC) 5332.414 ## ## Root Mean Square Error of Approximation: ## ## RMSEA 0.042 ## 90 Percent Confidence Interval 0.000 0.067 ## P-value RMSEA <= 0.05 0.672 ## ## Standardized Root Mean Square Residual: ## ## SRMR 0.046 # Rename the variables so they're easier to work with colnames(digital_quality_10) <- c(paste0("DQ", 1:10)) library(lavaan) # Define the model digital_quality_cfa_model <- 'F1 =~ DQ1 + DQ2 + DQ3 + DQ4 F2 =~ DQ5 + DQ6 + DQ7 F3 =~ DQ8 + DQ9 + DQ10' # fit the model to the data digital_quality_cfa <- cfa(data = digital_quality_10, model = digital_quality_cfa_model) # Summarize with standardized loadings, fit mesaures summary(digital_quality_cfa, standardized = TRUE, fit.measures = TRUE)
  • 29. ## ## Parameter Estimates: ## ## Information Expected ## Information saturated (h1) model Structured ## Standard Errors Standard ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## F1 =~ ## DQ1 1.000 0.602 0.712 ## DQ2 0.768 0.095 8.069 0.000 0.462 0.604 ## DQ3 0.839 0.100 8.414 0.000 0.505 0.635 ## DQ4 0.953 0.102 9.327 0.000 0.574 0.734 ## F2 =~ ## DQ5 1.000 0.486 0.673 ## DQ6 0.946 0.139 6.783 0.000 0.460 0.628 ## DQ7 1.130 0.167 6.787 0.000 0.550 0.629 ## F3 =~ ## DQ8 1.000 0.649 0.831 ## DQ9 0.940 0.092 10.172 0.000 0.610 0.719 ## DQ10 0.924 0.092 10.101 0.000 0.600 0.711 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## F1 ~~ ## F2 0.163 0.032 5.090 0.000 0.557 0.557 ## F3 0.199 0.037 5.410 0.000 0.511 0.511 ## F2 ~~ ## F3 0.122 0.030 4.073 0.000 0.386 0.386 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .DQ1 0.352 0.043 8.138 0.000 0.352 0.493 ## .DQ2 0.371 0.039 9.506 0.000 0.371 0.635 ## .DQ3 0.377 0.041 9.203 0.000 0.377 0.597 ## .DQ4 0.281 0.036 7.723 0.000 0.281 0.461 ## .DQ5 0.286 0.039 7.256 0.000 0.286 0.548 ## .DQ6 0.325 0.040 8.092 0.000 0.325 0.606 ## .DQ7 0.462 0.057 8.075 0.000 0.462 0.605 ## .DQ8 0.189 0.035 5.350 0.000 0.189 0.310 ## .DQ9 0.349 0.042 8.239 0.000 0.349 0.484 ## .DQ10 0.352 0.042 8.383 0.000 0.352 0.495 ## F1 0.362 0.063 5.748 0.000 1.000 1.000 ## F2 0.236 0.049 4.795 0.000 1.000 1.000 ## F3 0.421 0.060 6.964 0.000 1.000 1.000
  • 31. Other CFA fit measures (absolute) Measure Value to look for Comparative Fit Index (CFI) > .9 Tucker-Lewis Index (TLI) > .9 Root Mean Square Error of Approximation (RMSEA) < .05
  • 32. Inspect alternative fit measures ## cfi tli rmsea ## 0.97954112 0.97122970 0.04186027 fitMeasures(digital_quality_cfa)[c("cfi","tli","rmsea")]
  • 33. New fit, new rules Multivariate normality is assumed in theory… ## Call: mardia(x = digital_quality_10) ## ## Mardia tests of multivariate skew and kurtosis ## Use describe(x) the to get univariate tests ## n.obs = 250 num.vars = 10 ## b1p = 10.76 skew = 448.43 with probability = 0 ## small sample skew = 454.8 with probability = 0 ## b2p = 142.38 kurtosis = 11.42 with probability = 0 Instead, use the mlr estimator for robust standard errors mardia(digital_quality_10) # fit the model to the data digital_quality_cfa_mlr <- cfa(data = digital_quality_10, model = digital_quality_cfa_model, estimator = 'mlr') # Summarize with standardized loadings, fit mesaures, # MLR estimator summary(digital_quality_cfa_mlr, standardized = TRUE, fit.measures = TRUE)
  • 34. ## lavaan 0.6-3 ended normally after 32 iterations ## ## Optimization method NLMINB ## Number of free parameters 23 ## ## Number of observations 250 ## ## Estimator ML Robust ## Model Fit Test Statistic 46.018 42.449 ## Degrees of freedom 32 32 ## P-value (Chi-square) 0.052 0.102 ## Scaling correction factor 1.084 ## for the Yuan-Bentler correction (Mplus variant) ## ## Model test baseline model: ## ## Minimum Function Test Statistic 730.192 594.765 ## Degrees of freedom 45 45 ## P-value 0.000 0.000 ## ## User model versus baseline model: ## ## Comparative Fit Index (CFI) 0.980 0.981 ## Tucker-Lewis Index (TLI) 0.971 0.973 ## ## Robust Comparative Fit Index (CFI) 0.983 ## Robust Tucker-Lewis Index (TLI) 0.976 ## ## Loglikelihood and Information Criteria: ## ## Loglikelihood user model (H0) -2639.166 -2639.166 ## Scaling correction factor 1.394 ## for the MLR correction ## Loglikelihood unrestricted model (H1) -2616.157 -2616.157 ## Scaling correction factor 1.214 ## for the MLR correction ## ## Number of free parameters 23 23 ## Akaike (AIC) 5324.333 5324.333 ## Bayesian (BIC) 5405.326 5405.326 ## Sample-size adjusted Bayesian (BIC) 5332.414 5332.414 ## ## Root Mean Square Error of Approximation: ## ## RMSEA 0.042 0.036 ## 90 Percent Confidence Interval 0.000 0.067 0.000 0.062 ## P-value RMSEA <= 0.05 0.672 0.791 ## ## Robust RMSEA 0.038 ## 90 Percent Confidence Interval 0.000 0.065 ## ## Standardized Root Mean Square Residual: ## ## SRMR 0.046 0.046 ## ## Parameter Estimates: ## ## Information Observed ## Observed information based on Hessian ## Standard Errors Robust.huber.white ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## F1 =~ ## DQ1 1.000 0.602 0.712 ## DQ2 0.768 0.108 7.095 0.000 0.462 0.604 ## DQ3 0.839 0.154 5.445 0.000 0.505 0.635
• 35.
##     DQ4               0.953    0.106    9.035    0.000    0.574    0.734
##   F2 =~
##     DQ5               1.000                               0.486    0.673
##     DQ6               0.946    0.202    4.683    0.000    0.460    0.628
##     DQ7               1.130    0.218    5.194    0.000    0.550    0.629
##   F3 =~
##     DQ8               1.000                               0.649    0.831
##     DQ9               0.940    0.090   10.498    0.000    0.610    0.719
##     DQ10              0.924    0.102    9.047    0.000    0.600    0.711
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   F1 ~~
##     F2                0.163    0.042    3.848    0.000    0.557    0.557
##     F3                0.199    0.041    4.894    0.000    0.511    0.511
##   F2 ~~
##     F3                0.122    0.036    3.411    0.001    0.386    0.386
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .DQ1               0.352    0.043    8.198    0.000    0.352    0.493
##    .DQ2               0.371    0.053    7.030    0.000    0.371    0.635
##    .DQ3               0.377    0.073    5.171    0.000    0.377    0.597
##    .DQ4               0.281    0.043    6.532    0.000    0.281    0.461
##    .DQ5               0.286    0.050    5.754    0.000    0.286    0.548
##    .DQ6               0.325    0.051    6.361    0.000    0.325    0.606
##    .DQ7               0.462    0.071    6.542    0.000    0.462    0.605
##    .DQ8               0.189    0.033    5.725    0.000    0.189    0.310
##    .DQ9               0.349    0.049    7.075    0.000    0.349    0.484
##    .DQ10              0.352    0.059    5.958    0.000    0.352    0.495
##     F1                0.362    0.066    5.466    0.000    1.000    1.000
##     F2                0.236    0.061    3.851    0.000    1.000    1.000
##     F3                0.421    0.065    6.489    0.000    1.000    1.000
• 36. Standardized estimates as a data frame

Good for slicing/dicing/reporting.

##    lhs op  rhs est.std    se      z pvalue ci.lower ci.upper
## 1   F1 =~  DQ1   0.712 0.043 16.679      0    0.628    0.796
## 2   F1 =~  DQ2   0.604 0.050 12.191      0    0.507    0.701
## 3   F1 =~  DQ3   0.635 0.048 13.353      0    0.542    0.728
## 4   F1 =~  DQ4   0.734 0.041 17.751      0    0.653    0.815
## 5   F2 =~  DQ5   0.673 0.055 12.233      0    0.565    0.780
## 6   F2 =~  DQ6   0.628 0.056 11.167      0    0.518    0.738
## 7   F2 =~  DQ7   0.629 0.056 11.190      0    0.519    0.739
## 8   F3 =~  DQ8   0.831 0.036 22.851      0    0.759    0.902
## 9   F3 =~  DQ9   0.719 0.041 17.436      0    0.638    0.799
## 10  F3 =~ DQ10   0.711 0.042 17.083      0    0.629    0.793

# =~ are loadings
# ~~ are correlations
dig_qual_df <- standardizedsolution(digital_quality_cfa)

dig_qual_df %>%
  filter(op == "=~") %>%
  mutate_if(is.numeric, round, 3)
• 37. Extracting factor scores

lavPredict() (as in compute)…

Exploring our scores:

##    vars   n mean   sd median trimmed  mad   min  max range  skew kurtosis
## F1    1 250    0 0.54   0.03    0.00 0.50 -1.63 1.41  3.04 -0.07     0.43
## F2    2 250    0 0.41   0.00    0.00 0.38 -1.17 1.18  2.35 -0.06    -0.15
## F3    3 250    0 0.59  -0.06    0.01 0.51 -2.04 1.53  3.57 -0.16     0.77
##      se
## F1 0.03
## F2 0.03
## F3 0.04

dig_qual_factor_scores <- as.data.frame(lavPredict(digital_quality_cfa))

describe(dig_qual_factor_scores)
multi.hist(dig_qual_factor_scores)
• 39. Construct validity: are we measuring what we claim to measure?

Convergent: items within a dimension hang together. Average variance extracted (avevar) > .5 and composite reliability (omega) > .7.

Discriminant: dimensions are distinct but not unrelated. Each factor's AVE should exceed its squared correlation with every other factor.

##               F1        F2        F3     total
## alpha  0.7678713 0.6770723 0.7950026 0.8051184
## omega  0.7686797 0.6757982 0.7952188 0.8571486
## omega2 0.7686797 0.6757982 0.7952188 0.8571486
## omega3 0.7660973 0.6731814 0.7949393 0.8625192
## avevar 0.4563900 0.4113460 0.5644469 0.4780325

Strong model fit paired with low-ish reliability could signal measurement error.

library(semTools)
reliability(digital_quality_cfa)
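The deck doesn't show the discriminant comparison directly; a minimal Fornell-Larcker-style sketch, assuming the fitted lavaan object and semTools output above, puts each factor's AVE next to its squared latent correlations:

# Sketch: compare AVE to squared latent correlations (Fornell-Larcker)
ave <- reliability(digital_quality_cfa)["avevar", c("F1", "F2", "F3")]
phi_sq <- lavInspect(digital_quality_cfa, "cor.lv")^2  # squared latent correlations

# e.g., F1 vs. F2: .456 > .557^2 = .310, supporting discriminant validity
ave["F1"] > phi_sq["F1", "F2"]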
• 40. Step 5-ish: Criterion validity

New variable: rewards_points, the points each respondent accumulated in the six months after taking the survey.

Can our measurement model be used to predict this dependent variable? I.e., does it possess predictive validity?
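Before the full structural model on the next slides, a rough first read is to correlate the extracted factor scores with the outcome (a sketch; it assumes rewards_points lives in the digital_quality_rewards data frame used below):

# Quick, informal look at predictive validity via factor scores
cor(dig_qual_factor_scores, digital_quality_rewards$rewards_points)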
• 41. Standardized data

rewards_points is not on a scale of one to five. Standardize all variables with scale().

##                vars   n mean sd median trimmed  mad   min  max range  skew
## DQ1               1 250    0  1  -0.60    0.01 1.75 -2.96 1.77  4.72 -0.05
## DQ2               2 250    0  1   0.61    0.02 1.93 -3.30 1.92  5.22 -0.21
## DQ3               3 250    0  1   0.50   -0.01 1.86 -3.27 1.75  5.02 -0.36
## DQ4               4 250    0  1   0.64    0.01 1.89 -3.19 1.92  5.11 -0.41
## DQ5               5 250    0  1   0.10   -0.01 0.00 -2.66 2.86  5.52 -0.02
## DQ6               6 250    0  1  -0.50    0.04 2.02 -3.23 2.22  5.45 -0.21
## DQ7               7 250    0  1   0.01   -0.01 1.69 -2.27 2.30  4.57  0.09
## DQ8               8 250    0  1  -0.26    0.01 0.00 -2.82 2.29  5.11  0.03
## DQ9               9 250    0  1   0.47    0.04 1.74 -3.05 1.65  4.70 -0.57
## DQ10             10 250    0  1  -0.27    0.03 1.75 -2.64 2.10  4.73 -0.21
## rewards_points   11 250    0  1  -0.01   -0.03 1.07 -3.25 2.87  6.11  0.18
##                kurtosis   se
## DQ1               -0.25 0.06
## DQ2               -0.08 0.06
## DQ3                0.40 0.06
## DQ4                0.84 0.06
## DQ5                0.59 0.06
## DQ6                0.48 0.06
## DQ7               -0.23 0.06
## DQ8                0.34 0.06
## DQ9                0.51 0.06
## DQ10               0.16 0.06
## rewards_points    -0.13 0.06

# Standardize our data and get descriptives
digital_quality_scaled <- scale(digital_quality_rewards)
describe(digital_quality_scaled)
• 42. Regression in lavaan

Just like base R – use ~.

## lavaan 0.6-3 ended normally after 28 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                        27
## 
##   Number of observations                           250
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                      55.095
##   Degrees of freedom                                39
##   P-value (Chi-square)                           0.045
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              997.216
##   Degrees of freedom                                55
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.983
##   Tucker-Lewis Index (TLI)                       0.976
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3425.509
##   Loglikelihood unrestricted model (H1)      -3397.962
## 
##   Number of free parameters                        27
##   Akaike (AIC)                                6905.018
##   Bayesian (BIC)                              7000.098
##   Sample-size adjusted Bayesian (BIC)         6914.505
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.041
##   90 Percent Confidence Interval          0.006  0.064
##   P-value RMSEA <= 0.05                          0.721
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.044
## 
## Parameter Estimates:

# Define the model
digital_quality_reg_model <- 'F1 =~ DQ1 + DQ2 + DQ3 + DQ4
                              F2 =~ DQ5 + DQ6 + DQ7
                              F3 =~ DQ8 + DQ9 + DQ10
                              rewards_points ~ F1 + F2 + F3'

# Fit the model to the data
digital_quality_reg <- sem(data = digital_quality_scaled,
                           model = digital_quality_reg_model)

# Include r-square in output
summary(digital_quality_reg, standardized = TRUE,
        fit.measures = TRUE, rsquare = TRUE)
• 43.
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   F1 =~
##     DQ1               1.000                               0.687    0.688
##     DQ2               0.874    0.107    8.159    0.000    0.601    0.602
##     DQ3               0.945    0.108    8.724    0.000    0.650    0.651
##     DQ4               1.079    0.112    9.664    0.000    0.741    0.743
##   F2 =~
##     DQ5               1.000                               0.679    0.680
##     DQ6               0.923    0.121    7.608    0.000    0.626    0.628
##     DQ7               0.910    0.121    7.538    0.000    0.618    0.619
##   F3 =~
##     DQ8               1.000                               0.800    0.801
##     DQ9               0.923    0.085   10.912    0.000    0.738    0.740
##     DQ10              0.902    0.084   10.697    0.000    0.721    0.723
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   rewards_points ~
##     F1                0.520    0.104    4.998    0.000    0.357    0.358
##     F2                0.551    0.106    5.205    0.000    0.374    0.375
##     F3                0.420    0.072    5.834    0.000    0.336    0.337
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   F1 ~~
##     F2                0.261    0.050    5.187    0.000    0.559    0.559
##     F3                0.282    0.053    5.364    0.000    0.514    0.514
##   F2 ~~
##     F3                0.210    0.051    4.108    0.000    0.387    0.387
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .DQ1               0.524    0.059    8.952    0.000    0.524    0.526
##    .DQ2               0.635    0.065    9.790    0.000    0.635    0.638
##    .DQ3               0.574    0.061    9.372    0.000    0.574    0.576
##    .DQ4               0.446    0.055    8.116    0.000    0.446    0.448
##    .DQ5               0.535    0.067    7.963    0.000    0.535    0.537
##    .DQ6               0.604    0.069    8.783    0.000    0.604    0.606
##    .DQ7               0.614    0.069    8.891    0.000    0.614    0.617
##    .DQ8               0.357    0.052    6.865    0.000    0.357    0.358
##    .DQ9               0.451    0.055    8.255    0.000    0.451    0.453
##    .DQ10              0.476    0.056    8.554    0.000    0.476    0.478
##    .rewards_points    0.245    0.034    7.215    0.000    0.245    0.246
##     F1                0.472    0.083    5.654    0.000    1.000    1.000
##     F2                0.461    0.089    5.196    0.000    1.000    1.000
##     F3                0.639    0.093    6.894    0.000    1.000    1.000
## 
## R-Square:
##                    Estimate
##     DQ1               0.474
##     DQ2               0.362
##     DQ3               0.424
##     DQ4               0.552
##     DQ5               0.463
##     DQ6               0.394
##     DQ7               0.383
##     DQ8               0.642
##     DQ9               0.547
##     DQ10              0.522
##     rewards_points    0.754
• 45. Plotting our model

rotation = 2 makes the diagram “read” from left to right.
whatLabels = "std" includes standardized coefficients.

# rotation = 2 makes diagram "read" from left to right
# whatLabels = "std" includes standardized coefficients
semPaths(digital_quality_reg, rotation = 2, whatLabels = "std")
• 46. Step 6: Replication

Does the test hold up among the same population over time (test/retest)?
Does the test hold up across distinct populations (invariance)? See the sketch that follows.
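Test-retest is demonstrated on the next slides; the deck doesn't include invariance code, so here is a minimal sketch. It assumes a hypothetical grouping variable segment (e.g., new vs. returning customers) in the data, and fits increasingly constrained models that lavTestLRT() then compares:

# `segment` is a hypothetical grouping variable, not in the deck's data
configural <- cfa(digital_quality_cfa_model, data = digital_quality_10,
                  group = "segment")
metric     <- cfa(digital_quality_cfa_model, data = digital_quality_10,
                  group = "segment", group.equal = "loadings")
scalar     <- cfa(digital_quality_cfa_model, data = digital_quality_10,
                  group = "segment",
                  group.equal = c("loadings", "intercepts"))

# Non-significant chi-square differences support invariance
lavTestLRT(configural, metric, scalar)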
• 47. Test-retest: are scores of the same respondent the same at T1 and T2?

## 
## Descriptive statistics by group
## group: 1
##      vars  n  mean    sd median trimmed   mad min max range  skew kurtosis
## id      1 70 35.50 20.35   35.5   35.50 25.95   1  70    69  0.00    -1.25
## time    2 70  1.00  0.00    1.0    1.00  0.00   1   1     0   NaN      NaN
## DQ1     3 70  1.47  0.91    1.0    1.25  0.00   1   5     4  2.23     4.87
## DQ2     4 70  1.41  0.84    1.0    1.20  0.00   1   5     4  2.27     5.02
## DQ3     5 70  1.50  1.00    1.0    1.25  0.00   1   5     4  2.03     3.29
## DQ4     6 70  2.21  1.19    2.0    2.07  1.48   1   5     4  0.71    -0.42
## DQ5     7 70  3.27  1.45    3.5    3.34  2.22   1   5     4 -0.28    -1.33
## DQ6     8 70  3.06  1.41    3.0    3.07  1.48   1   5     4 -0.04    -1.30
## DQ7     9 70  1.91  1.10    2.0    1.73  1.48   1   5     4  1.13     0.60
## DQ8    10 70  2.23  0.89    2.0    2.16  0.00   1   5     4  0.90     1.16
## DQ9    11 70  2.50  1.05    2.5    2.45  0.74   1   5     4  0.34    -0.36
## DQ10   12 70  2.59  1.03    3.0    2.54  1.48   1   5     4  0.32    -0.08
##        se
## id   2.43
## time 0.00
## DQ1  0.11
## DQ2  0.10
## DQ3  0.12
## DQ4  0.14
## DQ5  0.17
## DQ6  0.17
## DQ7  0.13
## DQ8  0.11
## DQ9  0.13
## DQ10 0.12
## --------------------------------------------------------
## group: 2
##      vars  n  mean    sd median trimmed   mad min max range  skew kurtosis
## id      1 70 35.50 20.35   35.5   35.50 25.95   1  70    69  0.00    -1.25
## time    2 70  2.00  0.00    2.0    2.00  0.00   2   2     0   NaN      NaN
## DQ1     3 70  1.61  0.94    1.0    1.43  0.00   1   5     4  1.76     3.07
## DQ2     4 70  1.59  0.96    1.0    1.39  0.00   1   5     4  1.77     2.89
## DQ3     5 70  1.80  1.11    1.0    1.59  0.00   1   5     4  1.45     1.49
## DQ4     6 70  2.13  1.13    2.0    1.98  1.48   1   5     4  0.83    -0.13
## DQ5     7 70  3.27  1.42    3.0    3.34  1.48   1   5     4 -0.30    -1.15
## DQ6     8 70  3.04  1.44    3.0    3.05  1.48   1   5     4 -0.07    -1.32
## DQ7     9 70  2.20  1.17    2.0    2.05  1.48   1   5     4  0.73    -0.32
## DQ8    10 70  2.44  0.86    2.0    2.41  1.48   1   5     4  0.64     0.70
## DQ9    11 70  2.54  0.96    3.0    2.52  1.48   1   5     4  0.22    -0.22
## DQ10   12 70  2.67  0.94    3.0    2.68  1.48   1   5     4 -0.03    -0.16
##        se
## id   2.43
## time 0.00
## DQ1  0.11
## DQ2  0.11
## DQ3  0.13
## DQ4  0.13
## DQ5  0.17
## DQ6  0.17
## DQ7  0.14
## DQ8  0.10
## DQ9  0.11
## DQ10 0.11

describeBy(digital_quality_t1_t2, group = "time")
• 48. r12: correlation of scaled scores across T1 and T2

Look for scores > .7.

## [1] 0.9337308

digital_quality_test_retest <- testRetest(
  t1 = filter(digital_quality_t1_t2, time == 1),
  t2 = filter(digital_quality_t1_t2, time == 2),
  id = "id")

digital_quality_test_retest$r12
  • 49. Recap
• 50. Resources

For the theory: DeVellis's Scale Development: Theory and Applications.
For the R: Dr. Wan Arifin's page: https://wnarifin.github.io/index.html.
For the marketing applications: Chapman & Feit's R for Marketing Research and Analytics.
For the practice: DataCamp's "Survey and Measurement Development in R" (by George!)