Beginning with SEM in lavaan II

This post continues the getting started with structural equation modeling series inspired by Dawn Iacobucci’s article: Everything You Always Wanted to Know About SEM (Structural Equations Modeling) But Were Afraid to Ask. In the series, I translate Iacobucci’s LISREL syntax into R lavaan’s.

Note that this post builds upon the previous one. Therefore you will need to execute all the commands in that post for the commands below to work. Besides from lavaan I use the R packages xtable and semplot to create a better looking output. On GitHub, you can find the complete R file which you can source.

The structural model

In lavaan, the structural model can be specified regressing the predictors on the dependent variable(s) using the tilde (‘~’) operator. Iacobucci creates the paths among constructs by relating some constructs into other constructs. In other words, the latent variables created by the confirmatory factor analysis, are regressed into some of the other latent variables.

Iacobucci’s model tested the contribution on:

path description formula
perception of value by cost and quality value ~ cost + quality
costumer satisfaction by quality and value csat ~ quality + value
intention to repurchase by cost and satisfaction repeaT ~ csat + cost

In the lavaan’s model syntax below I specified the structural equations to match the order of Iacobucci’s output. This will ease the comparison of the output between lavaan and Iacobucci’s paper later on. Specifying the model syntax as I did in the table above yields identical results, but the order of the entries in the output is inconsistent with Iacobucci’s report.

semStx <- "
repeaT ~ csat 
value ~ quality 
csat ~ quality
value ~ cost
repeaT ~ cost
csat ~ value 
"

Now we can fit the model with the sem() function. Iacobucci uses the table of factor intercorrelations as input data for fitting the model. As we saw here, factor intercorrelations can be extracted passing “cor.lv” as an argument to inspect(). This can be passed directly to the sample.cov argument as I show below:

semFit <- sem(semStx, sample.cov=inspect(mdlFit, "cor.lv"), 
              sample.nobs=n, std.lv=FALSE)

This time, to ease printing the output of the model fit in 'Iacobucci's style' I wrote the function iacobucciOutput(). This is a ‘convenience’ function, which wraps the commands to print the output, it is handy when the same commands are called with different inputs arguments. The function requires as input argument the model fit object created by a call to the lavaan’s functions sem or cfa. The function prints degrees of freedom, chi squared and associated p value, CFI and SRMR indexes.

iacobucciOutput <- function(mdlFit){
# this function wraps the commands to print the output a-la Iacobucci.
# Its only function is to avoid typing the same commands each time a 
# model's needs to be printed.
    print(paste(
        "chi^2_{", fitMeasures(mdlFit, "df"), 
        "} =", format(fitMeasures(mdlFit, "chisq"), nsmall = 2, digits = 2), 
        ", (p =", format(fitMeasures(mdlFit, "pvalue"), digits = 1), 
        "), CFI =", format(fitMeasures(mdlFit, "cfi"), digits = 2), 
        ", SRMR =", format(fitMeasures(mdlFit, "srmr"), digits = 2)
    ))
}
iacobucciOutput(mdlFit)

A quick glance to the values I got and the ones of Iacobucci show small differences, which do not cause concern.

Output extraction

The results of the path analysis can be extracted using the parameterEstimates() function. parameterEstimates() will format the results into a data.frame-like table. However, I wanted the output to look similar to Iacobucci’s, who placed a directional arrow between the construct and the construct predicting it. Moreover, Iacobucci had a star superscripted to the significant paths. Writing the output to the R console in such a way proved to require much more typing than expected (and I hate typing), so I thought of representing the output into a table. The R package xtable comes in handy to structure the output into a table:

# assign estimate values to a data.frame
estVals <- parameterEstimates(semFit)
# organize output into a table
require(xtable)
xtab2 ",
  # 3 the construct used as predictors
    e = estVals[estVals$op == '~', 'lhs'],
  # 4 the estimate of the path with a star if significant
    vals = paste0(
      format(estVals[estVals$op == '~', 'est'], digits = 1), 
      ifelse(estVals[estVals$op == '~', 'pvalue'] < .5, '*', ''))
  ))
print(xtab2, include.rownames = FALSE)

The coefficients I obtained for each path are also quite similar to Iacobucci's. This implies that lavaan and LISREL perform analyses which are pretty consistent and coherent. Moreover, we replicated Iacobucci's results, and replication is always reasurring. And useful!

The path diagram

The analysis of the structural model feels incomplete without inspection of the path diagram. A call to semPaths with layout = 'tree2' (the default 'tree' had to much overlap among labels and estimates) temporarily soothe my cravings for inspection.

semPaths(semFit, whatLabels = 'std', nCharNodes = 0, 
         layout = 'tree2', edge.label.cex = 1.2)

To increase readability I also increased the font size of the labels by specifying the edge labels to be 1.2 times bigger than the default. But the effect of this out-of-the-box fix lasts shortly: I wanted my plot to look like Iacobucci’s, or better. Customization is due!

For starters, the layout needs rearranging. To be able to figure out the layout I need to determine what attribute regulates it. As many other plotting functions in R, the plot produced by semPaths can be saved to an R object. This allows me to access all the plot’s attributes and modify them as I wish. Below I assign the plot specifications to the variable mdlPlot.

mdlPlot <- semPaths(semFit, whatLabels = 'std', 
   nCharNodes = 0, layout = 'tree2', edge.label.cex = 1.2)

A call to names(mdlPlot) shows a lists of objects which can be accessed as if mdlPlot was an R data.frame. For example, mdlPlot$layout shows the layout used to plot the path diagram, but it is a matrix of values and it looks a bit cryptic in its raw essence. The matrix has as many rows as variables, and 2 columns. Each column contains respectively the x and y coordinates for each variable plotted in the graph. The values range between -1 and 1. The positioning of objects follows the Cartesian coordinate system in which (0,0) is the center (i.e., origin), negative numbers represent a position on the bottom or left side of the graph and positive values represent locations on the upper or right side of the graph.

Given this, it is easy to alter the layout of the graph once we know which coordinates belongs to which variable. We can find the labels of the variables by calling Arguments$label of mdlPlot as in mdlPlot$Arguments$label . The names in the labels’ list are ordered consistently with the row of the matrix containing the x-y coordinates of the plotted variables. Therefore, we now know which coordinates belong to which box in the plot.

For example, the first row in the matrix returned by mdlPlot$layout contains x (1) and y (-1) coordinates for the repeaT variable. In the graph produced by semPlot, RepeaT is printed in the lower (-1) right (1) corner. However, Iacobucci sets the repeaT variable in the centered-rightmost position in the path diagram. Repositioning our repeaT to Iacobucci’s position could be achieved by specifying 1 for the x coordinate (e.g., right), and 0 for the y coordinate (e.g. center). The matrix below recreates Iacobucci’s layout:

ly<-matrix(c(1, 0,
   -0.33, 0,
    0.33, 0,
   -1,    1,
   -1,   -1), ncol=2, byrow=TRUE)
# which must be passed through the layout parameter below 
semPaths(semFit, whatLabels = 'std', nCharNodes = 0, layout = ly, 
   edge.label.cex = 1.2, residuals=FALSE, sizeMan=8)

Yes, OKAY, thank you! The arrows are not identical to Iacobucci's. BUT the path diagram looks pretty good nonetheless. Plus a title can be easily included and exporting to pdf is a piece of cake (if you are not obese or watching your line – to make that comment more accommodating I should have said `it is like taking a breath of air'; which works for breatharians too). Oh, I forgot, I removed the residuals from the plot with residuals=FALSE because they overlapped with some of the estimates. I also increased the readability of the manifest nodes by setting a larger width with sizeMan=8 (default is 5). And on the topic of not forgetting, let’s not forget colors. semPaths can color your day by making colored paths, simply by setting the what parameter. For example:

semPaths(semFit, what = "std",layout=ly, residuals = FALSE, 
   nCharNodes = 0, edge.label.cex = 1.2, sizeMan = 8)

creates a path diagram with green paths for positive estimates and red otherwise. Moreover, the line thickness represents the strength of the relationship. Pretty awesome! And of course, customizable!

DIAGNOSTICS

As in the previous post there are some diagnostics to determine whether the model that we are fitting is statistically sound:

Fit Indexes Values
– CFI > 0.97, SRMR < 0.05, RMSEA < 0.05
Paths Significance
– Non significant paths can imply that the model was not specified correctly or that the theoretical assumptions behind the paths is not correct.

The translation of Iacobucci’s full model will come in the next post.

Advertisements
Beginning with SEM in lavaan II

One thought on “Beginning with SEM in lavaan II

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s