Multiple-mediator analysis with lavaan

May 6, 2017February 25, 2020 paolotoffaninexample, introduction, lavaan, mediation analysis, multiple mediation, R, simple mediation

I wrote this brief introductory post for my friend Simon. I want to show how easy the transition from SPSS to R can be. In the specific case of mediation analysis the transition to R can be very smooth because, thanks to lavaan, the R knowledge required to use the package is minimal. Analysis of mediator effects in lavaan requires only the specification of the model, all the other processes are automated by the package. So, after reading in the data, running the test is trivial.

This time, to keep the focus on the mediation analysis I will skip reading-in the data and generate a synthetic dataset instead. This is because otherwise I would have to spend the next paragraph explaining the dataset and the variables it contains and I really want to only focus on the analysis.

Create toy data:

# make up some data with one mediator
set.seed(06052017)
X <- rnorm(100)
M <- 0.5*X + rnorm(100)
Y <- 0.7*M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M = M)

# add another mediator
M2 <- -0.35*X + rnorm(100)
Y <- 0.7*M2 + 0.48*-M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M1 = M, M2 = M2)

As shown in the lavaan website performing a mediation analysis is as simple as typing in the code below:

simpleMediation <- '
	Y ~ b * M + c * X
	M ~ a * X
	indirect := a * b
	total    := c + (a * b)
'
require(lavaan)
fit <- sem(model = simpleMediation, data = Data)
summary(fit)

For multiple mediators one simply need to extend the model recycling the code of the first mediator variable:

multipleMediation <- '
Y ~ b1 * M1 + b2 * M2 + c * X
M1 ~ a1 * X
M2 ~ a2 * X
indirect1 := a1 * b1
indirect2 := a2 * b2
total := c + (a1 * b1) + (a2 * b2)
M1 ~~ M2
'
fit <- sem(model = multipleMediation, data = Data)
summary(fit)

Note that with multiple mediators we must add the covariance of the two mediators to the model. Covariances are added using the notation below:

M1 ~~ M2

There are two ways to test the null hypothesis that the indirect effect are equal to each other. The first is to specify a contrast for the two indirect effects. In the definition of the contrast the two indirect effects are subtracted. If it is significant the two indirect effects differ.

contrast := indirect1 – indirect2

# including a contrast in the model
contrastsMediation <- '
Y ~ b1 * M1 + b2 * M2 + c * X
M1 ~ a1 * X
M2 ~ a2 * X
indirect1 := a1 * b1
indirect2 := a2 * b2
contrast := indirect1 – indirect2
total := c + (a1 * b1) + (a2 * b2)
M1 ~~ M2
'
fit <- sem(model = contrastsMediation, data = Data)
summary(fit)

The second option to determine whether the indirect effects differ is to set a constrain in the model specifying the two indirect effect to be equal.

indirect1 == indirect2

Then, with the anova function one can compare the models and determine which one is better. Including the constrain and comparing the models is simple:

constrainedMediation <- '
Y ~ b1 * M1 + b2 * M2 + c * X
M1 ~ a1 * X
M2 ~ a2 * X
indirect1 := a1 * b1
indirect2 := a2 * b2
total := c + (a1 * b1) + (a2 * b2)
# covariances
M1 ~~ M2
# constrain
indirect1 == indirect2
'
noConstrFit <- sem(model = multipleMediation, data = Data)
constrFit <- sem(model = constrainedMediation, data = Data)
anova(noConstrFit, constrFit)

In my case the test is not significant so there is no evidence that the indirect effects are different.

For these toy models there is no further need of customizing the calls to sem. However, when performing a proper analysis one might prefer to have bootstrapped confidence intervals. The bootstrap standard errors and the quantity of bootstrap samples to perform are controlled in the call to the sem function with the additional arguments se' andbootstrap’, as in:

fit <- sem(
	model = contrastsMediation,
	data  = Data,
 	se = "bootstrap",
 	bootstrap = 5000 # 1000 is the default
)

(Bootstrap) confidence interval can be extracted with the function calls 1) summary, 2) parameterEstimates, or 3) bootstrapLavaan

summary(fit, fit.measures=TRUE, standardize=TRUE, rsquare=TRUE,
estimates = TRUE, ci = TRUE)

parameterEstimates(fit, boot.ci.type="bca.simple")

bootstrapLavaan(fit, R=5000, type="bollen.stine",
FUN=fitMeasures, fit.measures="chisq")

NOTE that bootstrapLavaan will re-compute the bootstrap samples requiring to wait as long as it took the sem function to run if called with the bootstrap option.

Since this post is longer than I wanted it to be, I will leave as a brief introduction to mediation with lavaan. In this follow up post I describe multiple mediation with lavaan using an actual dataset. And if you might be interested in how to automate the plotting of lavaan’s output into a pretty graph read this post. On github is the whole code in one .R file.

—- UPDATE Feb 25th 2020 —-
I wrote a few posts that follow up on this from a structural equation modelling perspective which I believe can clarify some of the issues raised in the comments below. Here is the first post.

For customization of the plots which can be created using semPlot see this post too.

60 thoughts on “Multiple-mediator analysis with lavaan”

Multiple-mediation example – paolotoffanin says:

[…] Multiple-mediator analysis with lavaan […]

LikeLike

June 4, 2017 at 1:10 pm Reply
1. RoseB says:
  
  Hi thank you for this post, it’s very helpful. I am trying to plot a multiple mediation analysis (with 5 mediators) but semPlot looks awful because the mediators are all on the same level as the dependent variable, and so you can’t see the paths. Ideally I would like the independent variable to be at the top (or on the left), the mediators to be in the middle, and the dependent variable to be at the bottom (or on the right). It does not seem to be possible to do this in semPlot, but I noticed in your response to Lola M on July 6th that you had suggested doing this. I would be extremely grateful if you might be able to share some code to show how to do this. Many thanks.
  
  LikeLike
  
  February 24, 2020 at 4:35 pm Reply
  1. paolotoffanin says:
    
    Hi RoseB,
    I am answering from my phone so apologies for brevity and potential typos.
    In this post https://paolotoffanin.wordpress.com/2018/06/12/beginning-with-sem-in-lavaan-ii/ there is an explanation under “path diagram” about how to set up the layout of the plot to feed to the semplot function. Maybe that’s helpful enough? Otherwise let me know. … Goodluck!
    
    LikeLike
    
    February 24, 2020 at 5:30 pm
  2. RoseB says:
    
    Hi, thank you so much for the fast reply. I am trying to follow along with your example in that post (using the ‘iacobucci2.txt’ file) but I am unable to fit the models because I get error messages, eg. when I run ‘mdlFit <- cfa(mdlStx, sample.cov=costumerQuality, sample.nobs=n)' I get Error in file(file, "r") : cannot open the connection. In addition: Warning message: In file(file, "r") : cannot open file '.
    Apologies for the bother here. I will try to follow how to set up the layout without example data anyway!
    
    LikeLike
    
    February 24, 2020 at 6:30 pm
  3. paolotoffanin says:
    
    Hi RoseB, I suspect it the error has something to do with R not being able to find the file “iacobucci2.txt”. If you put the file “iacobucci2.txt” in the same directory from which you run R it should work.
    
    LikeLike
    
    February 25, 2020 at 8:51 am
  4. RoseB says:
    
    As an update, i was able to follow without using the data from the prior post. Thank you so much, I’m very grateful!
    
    LikeLiked by 1 person
    
    February 24, 2020 at 6:38 pm
2. RoseB says:
  
  Dear Paolo,
  Apologies for the additional question but I wondered whether you knew if it was possible to adjust the length of the edges/lines in semPlot? I tried to do this by editing the layout but it didn’t work. No problem if you don’t know. Many thanks again
  
  LikeLike
  
  February 25, 2020 at 11:54 am Reply
  1. paolotoffanin says:
    
    Hi RoseB,
    I think it should be, since one should be able to access all the graph properties once they are saved into an R object. However, I think it might take you quite a while to figure out how to set things. Therefore I wonder whether it would not quicker to export your graph and fine tunes these things with a program like inkscape rather than programmatically in R.
    On the other hand, if you want more help into looking into this maybe it’s easier if you write me an email with some data and code so that I can try out things…
    
    LikeLike
    
    February 25, 2020 at 3:08 pm
3. Jason says:
  
  Hi, thanks a lot for this. It’s really helpful!
  I have a question about using lavaan with three mediators. I’ve been unable to find a solution on Cross Validated, so remembering this useful post, thought I’d try my luck.
  I have M1, M2 and M3. X to M1 and M2 are parallel mediators to M3. M3 subsequently mediates Y. I also have covariates (covs). How would you code for the indirect effect of all mediators on Y? ( the ‘indriectM1M2M3’ path below). Here’s my model:
  
  M1 ~ a1*X + covs
  M2 ~ a2*X + covs
  M3 ~ a3*M1 + a4*M2 + X
  Y ~ b*M3 + c*X + covs
  
  IndirectM1M3 := a1*a3*b
  IndirectM2M3 := a2*a4*b
  IndirectM1M2M3 := (a1*a3*b) + (a2*a4*b)
  
  Direct := c
  
  Total := c + (a1*a3*b) + (a2*a4*b)
  
  Thanks for any advice you can offer.
  
  Jason
  
  LikeLike
  
  August 28, 2020 at 8:29 am Reply
4. laughingthroughgradschool says:
  
  Any good example I can use for reporting results appropriately?
  
  LikeLike
  
  May 26, 2021 at 10:50 pm Reply
Patricia says:

Your blog is super helpful, thank you for the contents. Could you explain a little bit more why a one must add the covariance between the two moderators (M1 ~~ M2)? Could you suggest a paper that explores this? Thank you.

LikeLike

July 3, 2017 at 6:34 am Reply
1. paolotoffanin says:
  
  Hi Patricia, I do not know of a paper exploring this topic in specific. Preacher and Hayes (2008) talk about this briefly, maybe you can find more references in their paper. That is also the paper I used to translate the mPlus formula to lavaan, see the `with’ clause in the mplus statements in appendix A.
  
  LikeLike
  
  July 3, 2017 at 7:40 am Reply
  1. Patricia says:
    
    Thanks, the paper solves my question!
    
    LikeLiked by 1 person
    
    July 6, 2017 at 5:53 am
Plotting multiple mediation – paolotoffanin says:

[…] Multiple-mediator analysis with lavaan […]

LikeLike

July 27, 2017 at 5:27 am Reply
LEMAITRE says:

Thanks for the post.
Is the same approach feasible with three mediators?

LikeLike

September 6, 2017 at 3:52 pm Reply
1. paolotoffanin says:
  
  Yes it is. You will have to extend the model to accommodate for a third mediator. Starting from modeling the mediator itself, adding indirect and total effects plus modeling the covariances of the 3 mediators which would be m1~~m2, m1~~m3 and m2~~m3. I suggest you then check with semPlot whether you created all the paths… Goodluck!
  
  LikeLike
  
  September 7, 2017 at 6:21 am Reply
  1. Charlotte says:
    
    Dear Paolotoffanin,
    Thank you for the post. It is so nicely written out and so helpful.
    I am running a 3 mediator mediation, following the instructions here. I first conducted cfa; I then want to compare the partial mediational model and the full mediation model. Here comes my questions: my cfa fit indices somehow are identical to my partial mediational model but if I take away the covariance part (i.e., m1~~m2, m1~~m3 and m2~~m3), the indices between cfa and partial mediational model would be different. I don’t think the fit indices should be identical but I also don’t think that I should take away the covariance of the mediators. Please kindly help me out. Thank you!
    
    LikeLike
    
    December 3, 2019 at 4:19 am
  2. paolotoffanin says:
    
    Dear Charlotte, I am not sure I understand but I fear there might be some problem with the models specifications. If the difference between the two models that you fit is only the presence/absence of the covariances then that is OK. If you fit the cfa and then, separately a mediation model (without the cfa part), I am not sure how you could get the same indexes since the indexes should be different. If, on the other hand, you are comparing the indexes fitted with cfa to the cfa indexes of a model in which you first fit cfa and then the structural equations it is OK to have the same indexes… (maybe my follow up post on starting with structural equation modelling can help figuring this out?)
    
    LikeLike
    
    December 3, 2019 at 7:25 am
  3. Charlotte says:
    
    Thank you for your super kind and prompt help! Yes I am looking forward to your posts on SEM! Is it on your blog already?
    
    What I am trying to do now is to compare the partial mediational (model 1) and full mediational model (model 2). But before the comparison,I I did run cfa first so that, to best of my knowledge, I am running a full SEM model (measurement model + structural model). What I understand from your reply is that it is okay to have the same indexes if I do it this way. But the reference article* for this study of mine, doesn’t provide the same indices for the cfa (measurement model) and structural model 1 (partial mediational model). That’s why I wonder if I have done anything wrong.
    
    I am new to R and SEM; if you don’t mind, could you please help me check my R script as below?
    Again, thank you very much for helping!!! =)
    
    Gratefully,
    Charlotte
    
    ##—
    model.cfa =’
    PI =~ PI1+PI2+PI3+PI4+PI5+PI6+PI7
    TS =~ TS1+TS2+TS3+TS4+TS5
    PS =~ PS1+PS2+PS3+PS4+PS5
    FS =~ FS1+FS2+FS3+FS4+FS5
    SLSS =~ SLSS1+SLSS2+SLSS3+SLSS4+SLSS5+SLSS6+SLSS7
    ‘
    ##run the model
    model.cfa.fit = cfa(model.cfa, data = purpose_SEM)
    ###summaries
    summary(model.cfa.fit,
    standardized=TRUE,
    rsquare=TRUE,
    fit.measures = TRUE)
    parameterestimates(model.cfa.fit, standardized = TRUE) ##CI for parameters
    —-
    model1=’
    PI =~ PI1+PI2+PI3+PI4+PI5+PI6+PI7
    TS =~ TS1+TS2+TS3+TS4+TS5
    PS =~ PS1+PS2+PS3+PS4+PS5
    FS =~ FS1+FS2+FS3+FS4+FS5
    SLSS =~ SLSS1+SLSS2+SLSS3+SLSS4+SLSS5+SLSS6+SLSS7
    
    SLSS ~ b1 * TS + b2 * PS + b3 * FS + c * PI
    TS ~ a1 * PI
    PS ~ a2 * PI
    FS ~ a3 * PI
    
    indirect1 := a1 * b1
    indirect2 := a2 * b2
    indirect3 := a3 * b3
    
    total := c + (a1 * b1) + (a2 * b2) + (a3 * b3)
    TS ~~ PS
    TS ~~ FS
    PS ~~ FS
    ‘
    fit1 <- sem(model = model1, data = purpose_SEM, se = "bootstrap")
    summary(fit1, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE, estimates = TRUE, ci = TRUE)
    parameterEstimates(fit1, boot.ci.type='bca.simple',standardized = TRUE)
    —
    model2='
    PI =~ PI1+PI2+PI3+PI4+PI5+PI6+PI7
    TS =~ TS1+TS2+TS3+TS4+TS5
    PS =~ PS1+PS2+PS3+PS4+PS5
    FS =~ FS1+FS2+FS3+FS4+FS5
    SLSS =~ SLSS1+SLSS2+SLSS3+SLSS4+SLSS5+SLSS6+SLSS7
    
    SLSS ~ b1 * TS + b2 * PS + b3 * FS + 0 * PI
    TS ~ a1 * PI
    PS ~ a2 * PI
    FS ~ a3 * PI
    
    indirect1 := a1 * b1
    indirect2 := a2 * b2
    indirect3 := a3 * b3
    
    total := (a1 * b1) + (a2 * b2) + (a3 * b3)
    TS ~~ PS
    TS ~~ FS
    PS ~~ FS
    '
    fit2 <- sem(model = model2, data = purpose_SEM, se = "bootstrap")
    summary(fit2, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
    parameterEstimates(fit2, standardized = TRUE)
    —–
    ##—compare: model 1 (partial) vs model 2 (full)
    anova(fit1, fit2)
    
    * Lim, S. A., & You, S. (2019). Effect of parental negligence on mobile phone dependency among vulnerable social groups: Mediating effect of peer attachment. Psychological reports, 122(6), 2050-2062.
    
    LikeLike
    
    December 4, 2019 at 12:34 am
LEMAITRE says:

Thanks for you help. I did what you recommended to do and it worked.
Now, I’m little bit confused. For practicing, I changed the order of the two mediators in your example and I noticed that the “total effect” is not the same. Is there a reason for that?

Defined Parameters:
Estimate Std.Err z-value P(>|z|)
indirect1 -0.237 0.087 -2.720 0.007
indirect2 -0.287 0.088 -3.257 0.001
total -0.328 0.206 -1.596 0.110

Defined Parameters:
Estimate Std.Err z-value P(>|z|)
indirect1 -0.287 0.088 -3.257 0.001
indirect2 -0.237 0.087 -2.720 0.007
total -0.429 0.200 -2.139 0.032

LikeLiked by 1 person

September 7, 2017 at 8:31 am Reply
1. paolotoffanin says:
  
  I could not reproduce your data. How did you invert the mediators? Using this formula:
  Y ~ b2 * M2 + b1 * M1 + c * X
  I get the exact same indirect and total effects for both models. If you like you could mail me your model specification [varati at hotmail dot com] and I can see whether I figure out what is different between our approaches. By the way. Your comment point out to me that I made a mistake in the specification of the total effect which I did not spot before.
  
  total := c + (a1 * b1) + (a1 * b1)
  should be:
  total := c + (a1 * b1) + (a2 * b2)
  
  Thank you very much for sending me into the right direction!
  
  LikeLike
  
  September 7, 2017 at 5:38 pm Reply
Lemaitre says:

I just did
Data <- data.frame(X = X, Y = Y, M1 = M2, M2 = M) instead of
Data <- data.frame(X = X, Y = Y, M1 = M, M2 = M2)

LikeLike

September 7, 2017 at 7:34 pm Reply
1. paolotoffanin says:
  
  That explains the difference. The reason you got different total effect is the mistake in the formula of the total effect
  total := c + (a1 * b1) + (a1 * b1)
  should be
  total := c + (a1 * b1) + (a2 * b2)
  If you try the proper formula the total effects are identical. The reason why it happened is that, the wrong formula was using the first mediator for both estimates. Therefore, inverting the order of the mediators yielded different estimates.
  
  LikeLike
  
  September 8, 2017 at 11:02 am Reply
  1. Lemaitre says:
    
    Thanks a lot. Now everything is ok.
    
    LikeLiked by 1 person
    
    September 8, 2017 at 12:10 pm
Bogdan says:

When I write this syntax for the model, I receive this error:

lavaan ERROR: unknown label(s) in variable definition(s): a1 b1 a2 a3 a4 b2 a5 a6 b6 c1 c2 c3

I’m not using R for a long time, so this could be a beginner’s question. Thank you for posting this! It is really helpful.

LikeLike

October 1, 2017 at 9:41 pm Reply
1. paolotoffanin says:
  
  Hi Bogdan, could you please provide more context? In my example I use those as coefficients which are then estimated by lavaan. Because you do not provide enough context I cannot know whether those are you variables’ names or coefficients which should be estimated…
  
  LikeLike
  
  October 3, 2017 at 5:55 am Reply
Corrie says:

Thank you so much for your easy to follow directions and code.

LikeLiked by 1 person

November 26, 2017 at 2:39 am Reply
Marishka Usacheva says:

Thank you, this is such a clear, concise, and informative article with an elegant set of solutions and helpful links for further information!

LikeLiked by 1 person

December 5, 2017 at 2:28 am Reply
Alexander says:

Hi,

Your multiple mediator model example seems to be just identified (has 0 degrees of freedom) and thus does not estimate model fit. The same happens with my own two mediator model, which is essentially identical but has a few control variables added. I could live with this, if I knew it was a necessary property of multiple mediator models (with no latent variables). Unfortunately, I could not confirm this anywhere. Indeed, Appendix B in the above cited Preacher & Hayes 2008 paper has an SPSS output which provides fit statistics.

So, I wonder if you could give me any pointers whether I should be worried about the lack of model fit statistics and more broadly, why some multiple mediator SEM models are just identified and others are not.

It’s much appreciated.
Alexander

LikeLiked by 1 person

March 8, 2018 at 12:41 pm Reply
1. paolotoffanin says:
  
  Hi Alexander, thank you for the question, I think I could develop this into a follow up post. In the mean time I hope I can give you some elements to get going. I believe just identified models are not per se problematic, as far as the theory behind them is solid. About determining the goodness of fit of your model you can look at RMSEA, SRMR, and CFI (e.g. fitMeasures(mdlFit, “cfi”)). Engel, Moosbrugger and Muller (2003) give a nice primer (Evaluating fit of structural equation models…). Another paper I also like is ‘A Meditation on mediation’, it provides a way to estimate the effect size of the mediation effect. As an extension to that I also compute AIC of the given model, then invert the mediator with the independent variable and check how AIC and effect size of the mediation effect changes. This procedure might provide more insights just by comparing one model with another, but then again, when using numbers basically any test is possible, but it remains to establish whether the test itself makes sense. The bottom line for me is that the theory has to drive the testing. Therefore I prefer a model with a less nice fit but which provides a better understanding to a model with a very good fit but making the theoretical explanation more complex.
  
  LikeLike
  
  March 9, 2018 at 12:24 pm Reply
2. paolotoffanin says:
  
  p.s.: adding “fit.measures=TRUE” to the summary call also provides some of the fit indexes that I mention above. By adding ‘rsquare=TRUE’ one can ask for R^2 as well. HTH
  
  LikeLike
  
  March 9, 2018 at 12:26 pm Reply
  1. Alexander says:
    
    Thanks for the reply! I’m looking forward to the new post!
    
    Just to clarify, what I was referring to is that your “multipleMediation” model from above and my own model both result in perfect fit(?): CFI = TFI =1, RMSEA = SRMR = 0, p-value that RMSEA <= 0.05 is NA. It is true, your model estimates AIC & BIC (in my model, these are missing for whatever reason). Perhaps, I was wrong to assume that this is due to the model being just identified. I was mainly assuming this, because if the covariance of the two mediators is not estimated, the fit measures become more meaningful both in your example and in my own model.
    
    Disclaimer, I haven't yet looked into the papers you recommend.
    
    LikeLike
    
    March 9, 2018 at 2:17 pm
  2. paolotoffanin says:
    
    The paper won’t help you with the ‘perfect fit (?)’ issue I am afraid. I am not sure why AIC and BIC would not be estimated in your model, but I know there is a google group for sem and one for lavaan and maybe they will be able to provide more insight that I could?
    
    LikeLike
    
    March 13, 2018 at 8:08 am
Sarah says:

Hi,
Thank you so much for this helpful post!
So how would control variables fit into this? I have been trying to add 3 controls to all legs of the mediation, but I am wondering how to address the indirect effect. Here is what I have:

model4 <- ' # DIRECT EFFECT
Y ~ c*X
# mediator
M ~ a1*X + a2*gender + a3*race + a4*SES
Y ~ b1*M + b2*gender + b3*race + b4*SES
# INDIRECT EFFECT (a*b)
ab := a1*b1
# TOTAL EFFECT
total := c + ((b1+b2+b3+b4)*(a1+a2+a3+a4))
'
emoabu.ind4 <- sem(model.emoabu4, data = SPIN, se = "bootstrap")
summary(emoabu.ind4)

Do I need to have multiple lines for the indirect and total effects?

Thank you so much!

LikeLike

May 31, 2018 at 6:41 pm Reply
1. paolotoffanin says:
  
  HI Sarah, I am sorry but I am afraid I do not understand what you are trying to do. Do you have one mediator and three control variables? Then maybe you are looking for something like this:
  
  # DIRECT EFFECT
  Y ~ b1*M + c1*X + c2*gender + c3*race + c4*SES
  # mediator
  M ~ a1*X + a2*gender + a3*race + a4*SES
  # INDIRECT EFFECT (a*b)
  ab := a1*b1
  # TOTAL EFFECT
  total1 := c1 + (a1 * b1)
  total2 := c2 + (a2 * b1)
  total3 := c3 + (a3 * b1)
  total4 := c4 + (a4 * b1)
  
  If you have semPlot installed try to plot the fit that you get from the sem() function to see determine whether you successfully represented all the path that you wanted to include in your analysis.
  
  LikeLike
  
  June 1, 2018 at 1:53 pm Reply
  1. Sarah says:
    
    Thank you so much!
    
    LikeLiked by 1 person
    
    June 1, 2018 at 4:28 pm
Nichola Raihani says:

Thanks for this really helpful post. I have a question on why the effect sizes I obtain with lavaan might differ from those I obtain using cumulative link models (clm, r package: ordinal) , which yields standardized beta estimates for predictor variables.

I have a dependent variable, Y, which is specified as an ordinal categorical variable (with 6 levels, where 1 < 2 < 3 < 4 < 5 < 6). I have my predictor, X, and also 2 mediators (M1 and M2, where M1 ~~ M2), and 3 other 'control' variables (specified as B, C and D, respectively). So my mediation model looks like this:

mediation.model.3<- '
Y ~ b1*M1 + b2*M2 + c1*X + c2*B + c3*C + c4*D
M1 ~ a1*X + a3*B + a5*C + a6*D
M2 ~ a2*X + a4*B + a7*C + a8*D
direct := c1
indirect1 := a1*b1
indirect2 := a2*b2
total := c1 + (a1*b1) + (a2*b2)
M1 ~~ M2 '

The thing I am confused about is that my predictor variable X has a large 'significant' beta coefficient when I run the full model using clm (r package: ordinal) – but when I look to see if X is mediated by M1 or M2 in package lavaan, the effect of X itself is no longer significant (i.e. the CI associated with the total effect include zero).

I'm wondering if I am violating the assumptions of lavaan by fitting an ordered categorical response term as the DV (Y). Do you know? Thank you!

LikeLike

June 4, 2018 at 11:07 am Reply
1. paolotoffanin says:
  
  Hi Nichola, I don’t know what the clm function does, I have never used the ordinal package before…
  I do not have experience with fitting ordered categorical variables in lavaan, my experience with lmer and glm says that they give an error if things do not make sense. I am not sure I am allowed to infer that everything is working fine if you did not get an error when you fitted your model, maybe asking to the lavaan google group whether lavaan handles ordered categorical response terms will provide you a better answer than mine.
  On the other hand, note that there is a bunch of people arguing that mediation does not require a significant total effect of X on Y (a list of references can be found in the Preacher and Hayes’ paper I cite above). Good luck!
  
  LikeLike
  
  June 6, 2018 at 6:26 am Reply
Nichola Raihani says:

Thanks very much – helpful suggestions!

LikeLiked by 1 person

June 6, 2018 at 8:27 am Reply
Lola M. says:

Thank you so much for this series of posts on lavaan and multiple mediation. I am testing a model that has three mediators, one IV, and one DV (no controls). You suggested to a different commenter to use semPath to ensure that the model has been correctly identified. I followed this step but I’m still not positive my syntax is correct. Here is a link to what my plotted coefficients look like: https://ibb.co/k82mey. The mediators are FSO, SAB, and SRS. The predictor is MMS and the outcome is DVM.

And, here is the syntax:

outmodelmovie <-"
DateViolMean ~ c*MovieMonthSum + b1*SRS8Mean + b2*SABWMn + b3*FSexO10Mn

#mediator models
SRS8Mean ~ a1*MovieMonthSum
SABWMn ~ a2*MovieMonthSum
FSexO10Mn ~ a3*MovieMonthSum

#indirect effects (IDE)
SRS8MeanIDE := a1*b1
SABWMnIDE := a2*b2
FSexO10MnIDE := a3*b3
sumIDE := (a1*b1) + (a2*b2) + (a3*b3)

#total effect
total := c + (a1*b1) + (a2*b2) + (a3*b3)

#model correlation between mediators
SRS8Mean ~~ SABWMn
SRS8Mean ~~ FSexO10Mn
SABWMn ~~ FSexO10Mn
"

LikeLike

July 5, 2018 at 6:03 pm Reply
1. paolotoffanin says:
  
  Hi Lola, your syntax appears correct to me, as is the path diagram generated by semPath. I do agree that the diagram looks a bit confusing, but if you look at the arrows connecting the variables and you check the directionality you will notice that some arrow only go in one direction (e.g. SAB points to DVM, but not the other way around). The arrows pointing in both directions identify covariances as, for example, the double-headed arrow connecting SAB and FSO. If I might suggest an improvement to your plot, you could try playing around with the layout in semPath, and then position the IV on the left, mediators in the middle and DV on the right. I think this arrangement could make the plot more intuitive to read. If you want to give this a try I described how to achieve a similar result in this post
  I think your analysis looks sound, but you could add two things:
  1 – mediation effect size (sumIDE / total) – see (“A meditation on mediation…”)
  2 – if you like to show if a specific indifrect effect through a mediator is larger than another you can contrast the mediators by subtracting the indirect effects from one another (e.g. SRS8MeanIDE – SABWMnIDE; SRS8MeanIDE – FSexO10MnIDE; SABWMnIDE – FSexO10MnIDE). HTH!
  
  LikeLike
  
  July 6, 2018 at 11:16 am Reply
  1. Lola M. says:
    
    Thank you very much for these suggestions! They are greatly appreciated.
    
    LikeLike
    
    July 6, 2018 at 3:57 pm
  2. Lola M. says:
    
    I tried to contrast the mediators but I didn’t see any results in the output. Here is my syntax. Can you spot what I might be doing wrong?
    
    outmodelTV <-"
    DateViolMean ~ c*PopTV52 + b1*SRS8Mean + b2*SABWMn + b3*FSexO10Mn
    
    #mediator models
    SRS8Mean ~ a1*PopTV52
    SABWMn ~ a2*PopTV52
    FSexO10Mn ~ a3*PopTV52
    
    #indirect effects (IDE)
    SRS8MeanIDE := a1*b1
    SABWMnIDE := a2*b2
    FSexO10MnIDE := a3*b3
    sumIDE := (a1*b1) + (a2*b2) + (a3*b3)
    
    #contrasts to determine if the IDEs differ
    contrast1 := SRS8MeanIDE – SABWMnIDE
    contrast2 := SRS8MeanIDE – FSexO10MnIDE
    contrast3 := SABWMnIDE – FSexO10MnIDE
    
    #total effect
    total := c + (a1*b1) + (a2*b2) + (a3*b3)
    
    #covariates
    SRS8Mean ~~ SABWMn
    SRS8Mean ~~ FSexO10Mn
    SABWMn ~~ FSexO10Mn
    "
    
    fit <-sem(outmodelTV, data=mv, se="bootstrap", bootstrap=10000)
    
    #view output
    summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
    
    #bootstrap CIs
    bootfitTV <-parameterEstimates(fit, boot.ci.type="bca.simple",level=0.95, ci=TRUE,standardized = FALSE)
    bootfitTV
    
    LikeLike
    
    July 7, 2018 at 6:10 pm
  3. paolotoffanin says:
    
    Hi Lola, Do I understand correctly that you do not see the results with the name contrast1, 2 and 3 in the output of the calls to either summary or parameterEstimates?
    I tried to reproduce your error by fitting the model you describe above and applying the fit to a dataset after renaming some variables to your names, but I could not reproduce your problem. At the bottom of Defined Parameters, above ‘total’ I have the three contrasts.
    Maybe it’s an issue of earlier versions of lavaan? Then a solution could be to write out the whole difference instead of using labels, as in:
    contrast1 := a1 * b1 – a2 * b2
    instead of:
    contrast1 := indirect1 – indirect2.
    Does this sort things out?
    
    LikeLike
    
    July 9, 2018 at 7:47 am
Lola M. says:

That worked! It may be the version of lavaan that was the issue. Thank you very much.

Finally, in your previous comment you suggested that I add the mediation effect size. I’m assuming that means that it’s not being captured in the sumIDE part of the syntax. What can I add to produce the effect size? Does that mean my c’ is NOT the coefficient next to sumIDE in the output?

LikeLike

July 10, 2018 at 10:40 pm Reply
1. paolotoffanin says:
  
  Hi Lola, nice that it worked.
  About the effect size of the meditation effect or ‘proportion of mediation’ (Iacobucci, Saldanha, Deng, 2007): it’s indeed not captured by the sumIDE, but it’s the ration between the sumIDE and your total effect. Rather than fitting the whole model with that formula included, you can also compute it in R taking the estimate of ‘sumIDE’ and dividing it by the estimate of ‘total’.
  
  LikeLike
  
  July 11, 2018 at 6:43 am Reply
  1. Lola M. says:
    
    Thank you so much for all your assistance. It’s truly appreciated. I’ve been struggling to find a good resource for learning about serial mediation, so if you know of any textbooks, I would love the recommendation. (I’ll also read the Iacobucci article!)
    
    LikeLike
    
    July 11, 2018 at 5:29 pm
  2. paolotoffanin says:
    
    Hi Lola, that is a very specific topic, I’m not sure you’ll find a whole book on it. A good introduction to lavaan and CFA is the book of Beaujean, but there isn’t any mediation, and very few SEM. On the internet there is quite a bit of materials, so maybe you’ll have to stitch together the pieces of wisdom you find in different webpages/articles. But if you find something that you think is very helpful please let me know ;-P
    
    LikeLike
    
    July 12, 2018 at 6:31 am
Christoff says:

Good day – first of all, this post has already been helpful. However, I have a question about combining moderation and mediation using lavaan.

In my example, I have 2 mediators (M1 & M2) and 2 moderators (C & D). My predictor is X and my dependent variable is Y. So far I know how to handle multiple mediators:

multipleMediation <- '
Y ~ b1 * M1 + b2 * M2 + c * X
M1 ~ a1 * X
M2 ~ a2 * X
indirect1 := a1 * b1
indirect2 := a2 * b2
contrast := indirect1 – indirect2
total := c + (a1 * b1) + (a2 * b2)
M1 ~~ M2
'
How should I add the moderators to this example?
Any feedback would be much appreciated!

Kind regards,
Christoff

LikeLike

August 7, 2019 at 10:52 am Reply
Savannah says:

Hello — Thank you for this helpful post!!! I am running serial mediation with two mediators (Hayes Process Model 6) in lavaan based on instructions from the following link: https://stats.stackexchange.com/questions/340857/serial-mediation-in-r-how-to-setup-the-model.

This is how I am currently setting up the model:
model <- "
m1 ~ a1 * x
m2 ~ a2 * x + d21 * m1
y ~ cp * x + b1 * m1 + b2 * m2
ind_eff := a1 * d21 * b2
"
fit <- lavaan::sem(model = model, data = tmw, se = "boot", bootstrap = 1000)
lavaan::parameterestimates(fit, boot.ci.type = "bca.simple")

However, I also have a control variable that I would like to include and need some assistance on how best to do this.

Thank you!

LikeLike

February 28, 2020 at 8:55 pm Reply
1. paolotoffanin says:
  
  Hi Savannah, I am a bit short in time this week to help you, but I’ll try to give you some hints to continue.
  The model set-up you got from stackoverflow is nice, and its extension with inclusion of a control variable shouldn’t be too complex. Try to merge my example with this one and you will be set to go. One way to make this easy is to draw (really, with pen and paper actually draw) your model with I/D mediators and control variables and draw the path (i.e. lines) connecting them. You then need a coefficient for each line. You already have the model set-up from the stackoverflow post, now simply add the control variable. If you set up the model like that and you still need further assistance maybe we can discuss how to continue further. Good luck.
  
  LikeLike
  
  March 2, 2020 at 8:44 am Reply
  1. Paul says:
    
    Hello, first, thanks for your illustration. Really very helpful!
    
    I am also dealing with serial mediation (with 2 mediators). Based on your illustration, I am wondering if the model should be specified like below (vs. the one set up in the stackoverflow):
    
    Y ~ b1 * M1 + b2 * M2 + c * X
    M1 ~ a1 * X
    M2 ~ a2 * X + d * M1
    indirect1 := a1 * b1
    indirect2 := a2 * b2
    indirect3 := a1 * d * b2
    total := c + (a1 * b1) + (a2 * b2) + (a1 * d * b2)
    
    I have another question. Since all my X/Y/M1/M2 are latent variables (with some observed variables in each of them), then I should first define these latent variables, before running the above multiple serial mediation analysis, am I correct?
    
    LikeLike
    
    September 2, 2020 at 2:39 pm
  2. Paul says:
    
    First I would like to say thank you for the illustration.
    
    The model I am dealing with is also a serial mediation. I wonder why the stackoverflow includes the indirect effect as “ind_eff := a1 * d21 * b2” only. Why doesn’t the model include 3 indirect effects:
    
    indirect 1 := a1 * b1
    indirect 2 := a2 * b2
    indirect 3 := a1 * d21 * b2
    
    My another question is that, if all my y/m1/m2/x are latent variables with their own observed variables, I should define them using the “=~” first before going to the mediation analysis, right? Are there other things I have to take care?
    
    LikeLike
    
    September 3, 2020 at 3:51 pm
itsannamiss says:

Do you have any advice for how to perform mediation analysis using two correlated mediators?

LikeLike

March 25, 2020 at 5:30 pm Reply
1. paolotoffanin says:
  
  Hmm, not really. In this paper (http://paolo.mp-concepts.net/PubFolder/Dalley2012.pdf) our mediators are moderately (negatively) correlated. I believe if the two mediators are very highly correlated (i.e. above 50%) I am not sure they might not yield very different conclusions… Structural equation models are mostly justified through theory so I would suggest to formulate sensible paths…
  
  LikeLike
  
  March 27, 2020 at 6:37 am Reply
Whitney says:

Hi, thank you so much for this post, it’s been so helpful. However, I did run into one problem, when I tried to test the null hypothesis by specifying a contrast or setting a constrained model as you stated I keep running into errors.

When I run the contrast model and look at the output of the model, I see that values have been generated for the estimates, but standard error, probability etc all appear as NA. – this is despite the fact that when I run the original multiple mediation model (minus specifying the contrast), the model runs fine and all the relevant output is generated.

When I try the second method using the constrained model, the output (for the constraints) shows as:
Constraints:
|Slack|
indirect1 – (indirect2) 0.000

And when I try to run the ANOVA analysis to compare the original mediation model and the constrained model, I receive this error message:
Error in if (any(STAT.delta[-1] < 0)) { :
missing value where TRUE/FALSE needed

I have tried everything I can think of and would really appreciate your input on why this might be. Thank you.

LikeLike

June 9, 2021 at 2:22 pm Reply
1. paolotoffanin says:
  
  Hi Whitney, I am not really sure I understand what you are doing, but I am sure that if the model has NA as estimates anova will likely not work for the model comparison. I think there is probably something off in the specification of your model with the contrast. One thing you could try to do is start new with a simpler model and add parameters one by one so that you can identify what is going wrong. It can also be that if you start new it works straightaway because maybe you typed something erroneously the first time you specified your models. Good luck.
  
  LikeLike
  
  June 15, 2021 at 1:33 pm Reply
Sam Choi says:

Hello, thank you for posting this code. It was very helpful. I was wondering if this code could be run with dichotomous or categorical outcome variable. Thank you!

LikeLike

October 6, 2021 at 2:02 pm Reply
Ben Smith says:

Is it fair to say that this toy model assumes there is no relationship between M1 and M2? I noticed that you added a covariance measure, but in my own testing, adding the covariance line doesn’t actually impose an additional constraint, per se, because everything else remains the same.

In my view, if you thought there _could_ be a relationship between, say, M1 and M2 as sequential mediators of Y and X, then you should add that sequential model, even (especially!) if you are more interested in the role of M1 and M2 as independent mediators. Would you agree or disagree with that?

LikeLike

April 20, 2022 at 9:12 pm Reply
Elena says:

Hey there.
My predictor is categorical and has 5 levels. I did create dummy codes but I continue to receive the following error:

lavaan ERROR: unordered factor(s) with more than 2 levels detected as exogenous covariate(s)

Do you know how to fix this?

LikeLike

June 12, 2022 at 12:26 pm Reply