### Step 2 of 5 -- Identification of the SEM Model

The content in this article on step two of Structural Equation Modeling is summarized from three sources: R. H. Hoyle (ed.), *Structural Equation Modeling* (SAGE Publications, Inc., 1995), courtesy of Google Books; StatSoft (the electronic statistics textbook by the creators of *STATISTICA* data analysis and software services); and the skillful writing of Ricka Stoelting, who was a graduate student at San Francisco State University at the time she wrote about SEM.

In this step, the market researcher first considers whether a **unique value** can be obtained for **every free parameter** from the observed data. This matters because the free parameters are estimated from the observed data.

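A quick refresher: A **free parameter** in SEM is one whose value is unknown and must be estimated from the data; it is generally expected to have a value other than zero. A **fixed** parameter is set to a specific value in advance, generally zero, which means that the variables are specified as not related.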

To distinguish the free parameter paths from the fixed parameter paths, an asterisk is used to mark the paths of free parameters.

**Constrained Parameter** -- Some parameters are **constrained** because they are set equal to another parameter. **Data points** are the variances and covariances of the observed variables; with *p* observed variables there are *p*(*p* + 1)/2 of them. The number of data points is therefore not the same as the number of observed variables.

**Estimation** of the model is the next step in the five-step SEM process. **Hypothesis testing** is the fourth step. For the model to be estimated in a way that allows the hypotheses among the variables to be tested, the model must be over-identified. **Over-identification** can only occur if the number of data points (the number of variances and covariances of the observed variables) exceeds the number of parameters to be estimated.
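The counting behind identification can be sketched in a few lines of Python. This is a minimal illustration rather than part of any SEM package: the function names `data_points` and `identification_status` are hypothetical, and the example numbers are made up. It relies on the standard counting rule, in which *p* observed variables yield *p*(*p* + 1)/2 data points and the degrees of freedom equal the data points minus the free parameters. The rule is a necessary condition for identification, not a sufficient one.

```python
def data_points(n_observed: int) -> int:
    """Unique variances and covariances among the observed variables: p(p + 1)/2."""
    return n_observed * (n_observed + 1) // 2


def identification_status(n_observed: int, n_free_params: int) -> str:
    """Classify a model by the counting rule (necessary, not sufficient)."""
    degrees_of_freedom = data_points(n_observed) - n_free_params
    if degrees_of_freedom > 0:
        return "over-identified"
    if degrees_of_freedom == 0:
        return "just-identified"
    return "under-identified"


# Hypothetical model: 4 observed variables and 8 free parameters.
print(data_points(4))               # 10
print(identification_status(4, 8))  # over-identified
```

A just-identified model has zero degrees of freedom, so it reproduces the data exactly and leaves nothing with which to test fit. That is why the over-identified case is the one of interest for hypothesis testing.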

**Some Little Known Facts About SEM**

The five steps to creating an SEM model seem quite straightforward; however, the mathematical underpinnings that enable the modeling are extremely complicated. A structural equation model can only ever be an approximation of reality.

Structural models use linear relations. The operative word is "relations." Linear relations are mathematical in nature and don't necessarily reflect the realities of the real world. Generally, a nonlinear pattern describes the relationships between most variables. This reminder from StatSoft, 2011 is key: "The real question is not so much, 'Does the model fit perfectly?' but rather, 'Does it fit well enough to be a useful approximation to reality, and a reasonable explanation of the trends in our data?'"

Not all models that fit the data are correct, or "true." It is not possible to prove that a model is correct. A different model might fit the data just as well, and we are left not knowing which is accurate, or true. The logic behind this goes something like this: If Sylvester is a parrot, Sylvester has feathers. But just because Sylvester has feathers does not mean that Sylvester is a parrot. Sylvester could be a myna bird, a toucan, or a canary. Using the same pattern of reasoning, a market researcher can assert that, "If a particular model is true, it will fit the data in my research project." However, the SEM model that fits the data might not be the correct model. Any number of models could fit the data in the market research project, and only one of them is "true" -- in the absolute sense of the word. But if the market researcher thinks of the "little known fact" directly above, then the researcher will recall that linearity is not the rule in the real world. So the market researcher must settle for finding a reasonable fit for the data set.

### How Is Path Analysis Used?

A path diagram is used to show the relations between variables -- especially which variables cause changes in other variables. Variables can be latent or manifest. Latent variables cannot be observed directly and are placed in the ovals of a path analysis schematic. Instead of being measured, the values of latent variables must be inferred through their relationship with the observed variables. Two or more measured variables must be used to determine a value for a latent variable. Measured variables (also called **manifest variables**), which are observable, are placed in the boxes of the path diagram. In the path diagram, independent variables have arrows that point to the dependent variables. The path diagram is said to be **isomorphic**: it corresponds one-to-one with the set of linear equations it represents.
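One way to make the structure of a path diagram concrete is to store each arrow as a (cause, effect) pair. The sketch below is illustrative only: the variable names (`Satisfaction`, `Loyalty`, `q1` to `q4`), the helper functions, and the coefficient labels are hypothetical and not taken from the sources above. It groups the arrows into one linear equation per dependent variable and counts the measured indicators attached to each latent variable.

```python
# Hypothetical path diagram: each pair is one arrow, (cause, effect).
paths = [
    ("Satisfaction", "Loyalty"),  # latent -> latent
    ("Satisfaction", "q1"),       # latent -> measured indicator
    ("Satisfaction", "q2"),
    ("Loyalty", "q3"),
    ("Loyalty", "q4"),
]
latent = {"Satisfaction", "Loyalty"}  # drawn as ovals; all other names are boxes


def equations(paths):
    """Group arrows by their target: one linear equation per dependent variable."""
    eqs = {}
    for cause, effect in paths:
        eqs.setdefault(effect, []).append(cause)
    return eqs


def indicator_counts(paths, latent):
    """Count the measured variables that each latent variable points to."""
    counts = {name: 0 for name in latent}
    for cause, effect in paths:
        if cause in latent and effect not in latent:
            counts[cause] += 1
    return counts


# Print each dependent variable as a linear function of its causes.
for effect, causes in equations(paths).items():
    terms = " + ".join(f"b_{cause}_{effect}*{cause}" for cause in causes)
    print(f"{effect} = {terms} + error")

# Flag latent variables with fewer than two measured indicators.
for name, count in indicator_counts(paths, latent).items():
    if count < 2:
        print(f"{name} has only {count} indicator(s)")
```

Because every arrow maps to exactly one term in one equation, the list of pairs and the printed equations carry the same information, which is the sense in which the diagram and the equations mirror each other.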

Consideration of the various components of the path analysis and the structural equation model so far has allowed us to "identify" the model. All the variables have been entered into the path analysis and we have an isomorphic representation. Now it is time to review Step 3.

- Specify the Model
- **Identify the Model**
- Estimate the Model
- Test the Model Fit
- Manipulate the Model