psy 512 structural equation modeling - self and ... · pdf file2/26/2017 1 psy 512: advanced...
TRANSCRIPT
2/26/2017
1
PSY 512: Advanced Statistics for
Psychological and Behavioral Research 2
�What is SEM?
�When should we use SEM?
�What can SEM tell us?
�SEM Terminology and Jargon
�Technical Issues
�Types of SEM Models
�Limitations of SEM
� Combination of path analysis and factor analysis
• Path analysis: concerned with the predictive ordering of
measured variables
� X �Y � Z
• Factor analysis: concerned with latent factors (i.e.,
unmeasured variables)
� Latent construct of “intelligence”
� SEM is also known as:
• Causal modeling
• Covariance structure analysis
• Latent variable modeling
• “LISREL” modeling
• Simultaneous equations
2/26/2017
2
�Psychologists focus a great deal of attention on latent variables• We are unable to measure these directly so we must rely
on measurable indicators
�An operationalization of data as an abstract construct • A data reduction method that uses “regression like”
equations
• Takes many variables and attempts to explain them with a one or more “factors”
• Correlated variables are grouped together and separated from other variables with little or no correlation
Latent
Variable
Variable 1 Variable 2 Variable 3 Variable 4
Intelligence
Raven’s
Matrices
Miller
Analogies
Wechsler
Adult
Intelligence
Scale
Wonderlic
Test
Intelligence is a “factor”
because it is an
unmeasured variable that is
hypothesized to cause the
covariation among a set of
measured variables
The measured variables
(e.g., Raven’s Matrices)
are referred to as
“indicators”
2/26/2017
3
Latent Variable (Factors, Constructs)
Measured Variable (Observed Variable, Indicator Variable, Manifest Variable)
Regression Weight or Factor Loading
Covariance
� Endogenous Factor: A factor that the model attempts to explain
(i.e., has one or more straight lines with arrows pointed at it)
� Exogenous Factor: A factor that the model does not attempt to
explain (i.e., no straight lines with arrows pointed at it)
� Induced Variable: An unmeasured variable (like a factor) that is
believed to be the result of other variables in the model
• Example: Stress (latent variable) may be determined by features such as role
ambiguity, negative life events, and social conflict
� Full Structural Equation Model: Combines a measurement model
and a structural model (includes associations between unmeasured
and measured variables)
• Measurement Model: Relates unmeasured variables to measured variables
(similar to factor analysis)
• Structural Model: Summarizes relationships between unmeasured variables
(similar to path analysis)
2/26/2017
4
Locus of Control,
Self-Esteem,
Overall
Satisfaction, and
Loneliness are
factors
Locus of Control
and Loneliness
are exogenous
factors…Self-
Esteem and
Overall
Satisfaction are
endogenous
factors
Planunhy,
planwork, and
planbett are
indicators of
Locus of Control
Each of the other
factors has its
own set of
indicators
The empty ovals
represent various
types of error
that influence the
indicators and
the endogenous
factors
2/26/2017
5
The four straight
lines connecting
the factors
represent
hypothesized
direct effects of
one factor on
another
The curved two-
headed arrow
connecting
Loneliness and
Locus of Control
indicates a
relationship
between those
factors
� η (eta): Dependent (endogenous) variable that is unmeasured
(unobserved, latent)
� y: Indicator of dependent variable that is measured
(observed, manifest)
� ξ (xi or ksi): Independent (exogenous) variable that is
unmeasured (unobserved, latent)
� x: Indicator of independent variable that is measured
(observed, manifest)
� ε (epsilon): Error in observed dependent variable
� δ (delta): Error in observed independent variable
� ζ (zeta): Sources of variance in η not included among the ξs
(disturbances, error in equations, unexplained error in model)
� λy (lambda): Coefficients relating unmeasured dependent variables
to measured dependent variables
� λx (lambda): Coefficients relating unmeasured independent
variables to measured independent variables
� β (beta): Coefficients interrelating unmeasured dependent variables
� γ (gamma): Coefficients relating unmeasured independent variables
to unmeasured dependent variables
� φ (phi): Variances and covariances among unmeasured independent
variables
� ψ (psi): Variances and covariances among disturbances
� Θε (theta): Variances and covariances among errors in measured
dependent variables
� Θδ (theta): Variances and covariances among errors in measured
independent variables
2/26/2017
6
�The variables should be measured at either the interval or ratio scale
�The variables should have multivariate normal distributions
�The model should be correctly specified (e.g., relevant variables should be included and the direction of causal flow should be correct)
�SEM requires large samples for reliable results• More participants are needed for more complex
models, models with weak relationships, models with few measured variables, and models with variables that have nonnormal distributions
�The first step in modeling is the specification of a model• This should be based as much as possible on
previous knowledge
• If theory suggests competing models, then they should each be specified
�After a model is specified, the next step is to obtain parameter estimates (i.e., estimates of the coefficients representing direct effects, variances, and covariances)• This is accomplished using SEM software such as
AMOS
�AMOS will determine the estimates that will
most nearly reproduce the matrix of
observed relationships (i.e., correlation
matrix)
2/26/2017
7
The SEM program will determine the estimates that most nearly reproduce the
matrix of observed correlations. For example, Lifesat and Planunhp have a
correlation of -.12. The SEM program will look at the various ways of connecting
those two variables in your model and try to reproduce the observed correlation
of -.12 (while doing the same thing for every other observed association)
The numbers beside
the lines represent the
magnitude of the
effects. The numbers
at the tails of the
arrows represent the
variance of errors. In
this model, Self-
Esteem has a positive
association with
Overall Satisfaction
whereas Loneliness
has a negative
association with
Overall Satisfaction
� Evaluating SEM involves the following…
• Theoretical Criteria
� The model parameters should be assessed from a theoretical perspective (e.g.,
are the signs and magnitudes of the coefficients consistent with what is known
from previous research?)
• Statistical Criteria
� Identification status of the model (i.e., is there a unique solution for each
parameter in the model?)
� The reasonableness of the parameters (e.g., negative variances or correlations
greater than 1 will alert you to problems)
• Assessment of Fit
� The goal of SEM is to produce estimates that most nearly reproduce the
relationships in the original correlation matrix
� Fit indices determine how well the model matches the observed relationships
� Researchers want fit indices that indicate a good fit (e.g., statistics that say any
differences that appear could have occurred by chance)
2/26/2017
8
� Comparative Fit Index (CFI): Compares the proposed
model to an independence model (where nothing is
related)
� Root Mean Square Error of Approximation (RMSEA):
Compares the estimated model to a saturated or perfect
model
� Other popular fit indices• Normed Fit Index (NFI)
• Goodness of Fit Index (GFI)
• Incremental Fit Index (IFI)
• Non-normed Fit Index (NNFI)
• Akaike Information Criterion (AIC)
• Bayesian Information Criterion (BIC)
• Chi-square
�After examining the results of your SEM model,
you may decide to modify the model
• Many SEM programs will provide suggestions about
alterations to the model
�These modifications are “data-driven” and
there is the risk of capitalizing on chance (e.g.,
fitting the model to the peculiarities of one
particular sample)
• The probability values (p values) associated with a
model including data-driven modifications are not
accurate (to an unknown extent)
�We have discussed full structural equation
models so far…but there are other models
that are useful
• Simple submodels
� Confirmatory factor analysis (CFA)
� Path analysis
• Advanced extensions
� Longitudinal models
� Multisample models
2/26/2017
9
�CFA involves only a measurement model• Only models the direct effects of the factors on the
measured variables, the covariances among the factors,
and the errors of measurement
• Does not include specification of a structural model that
relates the factors to each other
�Unlike Exploratory Factor Analysis (EFA), CFA
allows the researcher to specify that particular
factors affect (or load on) particular measured
variables whereas all factors affect all
measured variables in EFA
�Path analysis only involves measured
variables
• No latent variables are involved
�There is a structural model without a
traditional measurement model (i.e., all of
the coefficients are fixed at 1)
2/26/2017
10
�SEM is very useful for the analysis of
longitudinal data such as panel data (i.e.,
measurements for the same individuals at
two or more time points)
• Allows us to develop a clearer idea about temporal
sequencing than can be drawn from a single point in
time
• With three or more time points it is possible to
estimate models with cross-lagged and synchronous
effects
2/26/2017
11
�An SEM model can be fit to two or more
groups simultaneously which allows for an
assessment of difference in the degree of fit
between the groups
• Example: The same model linking intelligence and
mental rotation abilities could be fit for men and
women simultaneously. This would allow researchers
to see if the model fit better for one sex than the
other
�Recursive
• Direction of influence one direction
� No reciprocal causation
� No feedback loops
• Disturbances not correlated
�Nonrecursive
• Either reciprocal causation, feedback loops, or
correlated disturbances
y2x1 y3
y3
y2
x3
x1
y1
x2
Model 1
Model 2
2/26/2017
12
x2 y1
x1 y2
y3
y2
x3
x1y1
x2
Model 1
Model 2
� SEM is a confirmatory approach
• Researchers need to have hypotheses about the relationships
• SEM is not “causal modeling”
� SEM is “correlational” but it can be used with experimental data
• Mediation and manipulation can be tested
� SEM is a sophisticated technique but it does not make up for
poor research design
� Biggest limitation is sample size
• It needs to be large to get stable estimates of the covariances/correlations
• It generally needs at least 200 participants for even small models
• A minimum of 10 participants are required for each estimated parameter
� SEM has proven to be a very versatile tool
� One strength of SEM is the requirement of prior knowledge
of the phenomena under examination
• In practice, this means that the researcher is testing a model that is
based on an exact and explicit plan or design
• Very complex and multidimensional structures can be measured
with SEM
• SEM is the only linear analysis method that allows complete and
simultaneous tests of all relationships
2/26/2017
13
� Limitations of SEM• Researchers must be very careful with the study design when using
SEM for exploratory work (i.e., SEM does not equal “causal
modeling”)
• SEM is complex and it is often done poorly
� A lot of jargon without a clear understanding
• Overgeneralization is always a problem and this is certainly true for
SEM
• SEM is based on covariances which are only stable when estimated
from large samples (i.e., at least 200 observations)
� …but really large samples can cause problems for some of the fit
statistics (e.g., leads to significant χ2)
• SEM programs allow calculation of modification indices which help
researcher to fit the model to the data
� ...but overfitting models to the data reduces generalizability!
�1. Work through the example in Amos • This can be done by following the instruction in the
Amos Tutorial (pp. 1-21)
• Print the path diagram and bring it to class
�2. Fit another model using data from another source• This can be data from your lab, found on the internet
(or other location), or fabricated
• The important thing is that you work through the process of SEM
• Print the path diagram, bring it to class, and be prepared to discuss what you did