multilevel sem with pseudobalanced groups and small samplesjoophox.net › papers › 2ndml.pdf ·...

of 18 /18
19-4-99 1 Multilevel SEM with pseudobalanced groups and small samples Joop Hox Cora Maas Utrecht University 2nd Amsterdam Multilevel Conference Amsterdam, March 30-31, 1999

Author: others

Post on 25-Jun-2020

2 views

Category:

Documents


0 download

Embed Size (px)

TRANSCRIPT

  • 19-4-99 1

    Multilevel SEM with pseudobalancedgroups and small samples

    Joop HoxCora Maas

    Utrecht University

    2nd Amsterdam Multilevel ConferenceAmsterdam, March 30-31, 1999

  • 19-apr-99 2

    Structural Equation Modeling (SEM)

    z Multivariate normalityy data summed up perfectly by covariance matrix S

    x + means M

    z Model for Σ contains parameters θy factor loadings, path coefficientsy estimated by Maximum Likelihoody standard errors, overall significance test

    z Softwarey e.g., Lisrel, Eqs, Amos

    z Submodel: Confirmatory Factor Analysis (CFA)

  • 19-apr-99 3

    Two level SEM

    z Assume a population that contains groupsz Sampling at two levels:

    groups and individuals within groups

    z Decompose population Total scores into (disaggregated)group means B and individual deviations W

    z In the population we have ΣT=ΣW+ΣB

    z Decompose the sample scoresz In the sample we have ST=SW+SB

  • 19-apr-99 4

    Multilevel SEM

    z Decompose the sample scores Tinto W (Within) and B (Between)

    z Compute pooled within groups covariances SPWz Compute between groups covariances SB

    z Muthén has shown thatz SPW = ΣW and SB = ΣW + nΣB

    y with n equal to common group size

  • 19-apr-99 5

    Estimating multilevel SEMwith standard SEM software

    z SPW = ΣWy model it with Within model

    z SB = ΣW + nΣBy model it with Within model + (scale factor n) * Between model

    z Use two-group option of standard SEM softwarey group I = Within, group II = Betweeny if groups are unequal, ignore that problem

    use average group size, call it pseudobalanced solution

  • 19-apr-99 6

    Our questions

    z How accurate is the pseudobalanced solution?y parameter estimatesy standard errorsy overall model test

    z Given unbalanced datay varying number of groupsy varying group sizesy varying proportion of group level variance

  • 19-apr-99 7

    The model

    z Two level Confirmatory Factor Analysisz Within model z Within model

    y1

    1

    y2

    1

    w1 w2 w3 w4 w5 w6

    .3 .4 .5 .3 .4 .5

    e1

    .25

    1

    e2

    .25

    1

    e3

    .25

    1

    e4

    .25

    1

    e5

    .25

    1

    e6

    .25

    1

    b1 b2 b3 b4 b5 b6

    r1

    .25

    1

    r2

    .25

    1

    r3

    .25

    1

    r4

    .25

    1

    r5

    .25

    1

    r6

    .25

    1

    X1

    .25

    .5 .4 .3 .5 .4.3

  • 19-apr-99 8

    Simulation procedure

    z Simulated conditionsy number of groups NG

    x 50 - 100 - 200

    y group size (average) GSx 5/15 (10) - 10/30 (20) - 25/75 (50)

    y observed variables group level variance (average ICC) ICCx variance of between factor is 0.25/0.50 (.20/.33)

    z Simulation procedure: do per condition 1000 timesy generate group and individual data, from model + conditiony combine into total scoresy compute SPW and SBy estimate model: store estimates, s.e.’s and chi-square

  • 19-apr-99 9

    Results:Within model

    z No inadmissible solutionsy all variance estimates > 0

    z Factor loadings (.3 .4 .5) mean bias 0.00005 (zero)y no differences across conditions

    z Variances (.25) mean bias -0.0005 (zero)y ‘significant’ effect of sample sizesy e.g., for NG 50/100/200: -0.00095/-0.00049,-0.00018

    z Coverage of 95% confidence intervalsy smallest 95.13%, largest 96.03%

    z Conclusion: No problem!

  • 19-apr-99 10

    Results:Between model

    z Some inadmissible solutionsy 8% of simulations have variance estimates < 0y most of these > 1y Note: program always converged

    z Inspect distribution of inadmissible solutions

    z Analyze effects of NGroups, GSize and ICC foradmissible solutions

  • 19-apr-99 11

    Distribution of between loading b1

    Bias of loading b1 foradmissible solutions

    Bias of loading b1 forinadmissible solutions

  • 19-apr-99 12

    Distribution of between variance ps1

    Bias of variance ps1 foradmissible solutions

    Bias of variance ps1 forinadmissible solutions

  • 19-apr-99 13

    Between model:distribution of inadmissible solutions

    z Inadmissible solutions for different NG, GS and ICCMaineffects:y NG 50/100/200: 18%/1%/0%y GS 10/20/50: 10%/0%/0%y ICC lo/hi 14%/2%

    z Interactions:y NG/ICC lo 29%/10%/1%y NG/ICC hi 6%/1%/0%y GS/ICC similar pattern, less dramaticy NG/GS and 3-way interaction significant but negligible

    z Conclusion: we have a problem!y Largest effect: NG, large GS and ICC compensate in part

  • 19-apr-99 14

    Results Between model,admissible solutions only: bias

    z Factor loadings (.3 .4 .5) mean bias 0.004y effect of NG (bias 0.007/0.004/0.005)y effect of ICC (bias 0.01/0.00)

    z Variances (.25) mean bias -0.01y effect of ICC (bias -0.013/-0.008)y effect of NG (bias -0.011/-0.010/-0.010)

    z Conclusion: No problem!

  • 19-apr-99 15

    Results Between model,admissible solutions only: coverage

    z Coverage of 95% confidence interval

    z Factor loadingsy smallest 89%, largest 94%y small effects of NG, ICC and GS, no real improvements

    x e.g., NG 50/100/200: 89%/90%/91%

    z Variancesy smallest 90%, largest 90%y small effects of NG, GS

    x e.g., NG 50/100/200: 88%/91%/91%

    z Conclusion: standard errors too smally but not totally misleading

  • 19-apr-99 16

    Results Between model,all solutions, overall chi-square test

    z Mean chi-square 19.6 (expected 18): too highy mean chi² significantly > 18 in all conditionsy smallest difference when NG=50, GS=10, ICC low

    z Overall percentage rejections 8.9%y admissible solutions 9.0%, inadmissible solutions 7.3%y effects of ICC and NGroup*ICC

    x ICC lo/hi: 7.9%/9.9%x ICC lo: NG 50/100/200 6.6%/7.9%,9.1%x ICC hi: NG 50/100/200 9.8%/10.5%/9.4%

    z Conclusion: too many rejectionsy large NG, GS and ICC do not help here

  • 19-apr-99 17

    Conclusions

    z Within model, no problemsz Between model, problems:

    y inadmissible solutions (usually grossly wrong)y estimates unbiased, buty standard errors too smally chi-squares too highy result: models too complex

    z Especially with small number of groupsz But results fairly close

    y e.g. if we choose α=0.25, operating α ≈ 0.05

  • 19-apr-99 18

    Some references

    z Muthén, B. (1994). Multilevel covariance structure analysis.Sociological Methods & Research, 22, 376-398.

    z McDonald, R.P. (1994). The bilevel reticular action model for pathanalysis with latent variables. Sociological Methods & Research,22, 399-413.

    z Hox, J.J. (1993). Factor analysis of multilevel data. Gauging theMuthén model. In: J.H.L. Oud & R.A.W. van Blokland-Vogelesang(eds.). Advances in longitudinal and multivariate analysis in thebehavioural sciences. Nijmegen, NL: ITS.

    z Hox, J.J. (1995). Applied multilevel modeling. Amsterdam: TTPublikaties.(Hox references downloadable from http://www.fsw.ruu.nl/ms/jh)