Where is data reduction in SPSS?
After deciding on the number of factors to extract and which analysis model to use, the next step is to interpret the factor loadings.

Factor rotations help us interpret factor loadings. There are two general types of rotations, orthogonal and oblique. The goal of factor rotation is to improve the interpretability of the factor solution by reaching simple structure.

Without rotation, the first factor is the most general factor onto which most items load and explains the largest amount of variance. This may not be desired in all cases. Suppose you wanted to know how well a set of items load on each factor; simple structure helps us to achieve this. For the following factor matrix, explain why it does not conform to simple structure using both the conventional and Pedhazur test.

Using the Pedhazur method, Items 1, 2, 5, 6, and 7 have high loadings on two factors (failing the first criterion), and Factor 3 has high loadings on a majority of the items, 5 out of 8 (failing the second criterion). We know that the goal of factor rotation is to rotate the factor matrix so that it approaches simple structure, which improves interpretability. Orthogonal rotation assumes that the factors are not correlated.
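The two Pedhazur-style checks above can be written down directly. The sketch below uses a hypothetical 8-item, 3-factor loading matrix (the values are illustrative, not the matrix from the text) and an assumed "high loading" cutoff of 0.40:

```python
import numpy as np

# Hypothetical 8-item x 3-factor loading matrix (illustrative values only).
loadings = np.array([
    [0.55, 0.10, 0.50],
    [0.08, 0.52, 0.48],
    [0.70, 0.05, 0.12],
    [0.06, 0.68, 0.15],
    [0.58, 0.12, 0.62],
    [0.10, 0.61, 0.59],
    [0.57, 0.09, 0.66],
    [0.11, 0.72, 0.15],
])

CUTOFF = 0.40  # assumed threshold for a "high" loading

def violates_simple_structure(L, cutoff=CUTOFF):
    """Return (items with high loadings on 2+ factors,
               factors with high loadings on a majority of items)."""
    high = np.abs(L) >= cutoff
    multi_items = np.where(high.sum(axis=1) > 1)[0]            # criterion 1
    majority = np.where(high.sum(axis=0) > L.shape[0] / 2)[0]  # criterion 2
    return multi_items, majority

items, factors = violates_simple_structure(loadings)
print("Items with high loadings on 2+ factors:", items + 1)   # 1, 2, 5, 6, 7
print("Factors loading on a majority of items:", factors + 1) # 3
```

With this illustrative matrix, both checks fail exactly as in the text: Items 1, 2, 5, 6, and 7 cross-load, and Factor 3 loads highly on 5 of 8 items.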

The benefit of doing an orthogonal rotation is that loadings are simple correlations of items with factors, and standardized solutions can estimate the unique contribution of each factor. The most common type of orthogonal rotation is Varimax rotation. We will walk through how to do this in SPSS.

First, we know that the unrotated factor matrix (the Factor Matrix table) should be the same. Additionally, since the common variance explained by both factors is unchanged, the Communalities table should also be the same.

The main difference is that we ran a rotation, so we should get the rotated solution (the Rotated Factor Matrix) as well as the transformation used to obtain it (the Factor Transformation Matrix). Finally, although the total variance explained by all factors stays the same, the variance explained by each individual factor will be different.

The Rotated Factor Matrix table tells us what the factor loadings look like after rotation (in this case, Varimax). Kaiser normalization is a method for obtaining stability of solutions across samples: each item's loadings are rescaled to unit communality before the rotation and rescaled back to their proper size afterwards. This means that equal weight is given to all items when performing the rotation.

The only drawback is that if the communality is low for a particular item, Kaiser normalization will weight that item equally with the high-communality items. As such, Kaiser normalization is preferred when communalities are high across all items.
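The rescaling step is simple to see in code. This sketch (with a hypothetical loading matrix) divides each item's row by the square root of its communality before rotation, so every row has unit length, then rescales back afterwards:

```python
import numpy as np

# Hypothetical 3-item, 2-factor loadings; Item 2 has low communality.
L = np.array([
    [0.80, 0.10],
    [0.20, 0.15],   # low-communality item
    [0.10, 0.75],
])

h = np.sqrt((L ** 2).sum(axis=1))   # square root of each item's communality
L_norm = L / h[:, None]             # Kaiser normalization: unit-length rows

print((L_norm ** 2).sum(axis=1))    # every communality is now 1.0
# ... the rotation would be performed on L_norm here ...
L_back = L_norm * h[:, None]        # rescale back to the original size
```

This is why the low-communality item gets the same weight as the others during the rotation: after normalization, all rows contribute equally.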

Kaiser normalization can be turned off in the SPSS syntax. Here is what the Varimax-rotated loadings look like without Kaiser normalization. Compared to the rotated factor matrix with Kaiser normalization, the patterns look similar if you flip Factors 1 and 2; this may be an artifact of the rescaling.

The biggest difference between the two solutions is for items with low communalities, such as Item 2. Kaiser normalization weights these items equally with the other, high-communality items. In both the Kaiser-normalized and non-Kaiser-normalized rotated factor matrices, focus on the loadings whose magnitudes exceed the cutoff.

We can see that Items 6 and 7 load highly onto Factor 1 and Items 1, 3, 4, 5, and 8 load highly onto Factor 2. Item 2 does not seem to load highly on any factor.

The figure below shows the path diagram of the Varimax rotation. Comparing this solution to the unrotated solution, we notice that there are high loadings in both Factor 1 and 2.

This is because Varimax maximizes the sum of the variances of the squared loadings, which in effect maximizes high loadings and minimizes low loadings. In SPSS, you will see a matrix with two rows and two columns because we have two factors.

How do we interpret this matrix? How do we obtain the new transformed pair of values? Essentially, you take a row of the unrotated factor matrix as one ordered pair, take a column of the Factor Transformation Matrix as another ordered pair, and multiply the matching elements and sum them (a dot product).

We have obtained the new transformed pair with some rounding error. The figure below summarizes the steps we used to perform the transformation.

The Factor Transformation Matrix can also tell us the angle of rotation if we take the inverse cosine of a diagonal element. Notice that the original loadings do not move with respect to the original axes; you are simply re-defining the axes for the same loadings. Because the rotated Factor Matrix is different, the squared loadings are different, and hence the Sums of Squared Loadings will be different for each factor. However, if you sum the Sums of Squared Loadings across all factors for the rotated solution, the total is the same as for the unrotated solution.
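The whole transformation can be sketched in a few lines. The loading matrix and the 37-degree angle below are hypothetical, not values from the text; the point is that rotating by an orthogonal transformation matrix recovers the angle from the inverse cosine of a diagonal element and leaves the total sum of squared loadings unchanged:

```python
import numpy as np

# Illustrative unrotated 4-item, 2-factor loading matrix.
F_unrotated = np.array([
    [0.70,  0.30],
    [0.65,  0.25],
    [0.60, -0.40],
    [0.55, -0.45],
])

theta = np.deg2rad(37.0)  # hypothetical rotation angle
# Factor Transformation Matrix: an orthogonal rotation by theta.
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Each rotated loading is the dot product of a row of the unrotated
# matrix with a column of the transformation matrix.
F_rotated = F_unrotated @ T

# The angle of rotation is the inverse cosine of a diagonal element.
angle = np.degrees(np.arccos(T[0, 0]))  # 37.0

# Per-factor sums of squared loadings change, but their total does not,
# because an orthogonal rotation preserves the total common variance.
total_before = (F_unrotated ** 2).sum()
total_after = (F_rotated ** 2).sum()
print(angle, total_before, total_after)
```

The invariance of the total holds for any orthogonal transformation matrix, which is exactly why the Rotation Sums of Squared Loadings add up to the same total as the Extraction Sums of Squared Loadings.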

This is because rotation does not change the total common variance. Looking at the Rotation Sums of Squared Loadings for Factor 1, it still has the largest total variance, but now that shared variance is split more evenly.

Varimax rotation is the most popular orthogonal rotation. Its benefit is that it maximizes the variances of the squared loadings within each factor, which amplifies the differences between high and low loadings on a particular factor: higher loadings are made higher and lower loadings are made lower.

This makes Varimax rotation good for achieving simple structure but not as good for detecting an overall factor because it splits up variance of major factors among lesser ones.

Quartimax may be a better choice for detecting an overall factor. It maximizes the squared loadings so that each item loads most strongly onto a single factor. Here is the output of the Total Variance Explained table juxtaposed side by side for Varimax versus Quartimax rotation. You will see that whereas Varimax distributes the variance evenly across both factors, Quartimax tries to consolidate more variance into the first factor. Equamax is a hybrid of Varimax and Quartimax and, because of this, may behave erratically according to Pett et al.

Like orthogonal rotation, the goal is rotation of the reference axes about the origin to achieve a simpler and more meaningful factor solution compared to the unrotated solution.

In oblique rotation, you will see three unique tables in the SPSS output: the Pattern Matrix, the Structure Matrix, and the Factor Correlation Matrix. Suppose the Principal Investigator hypothesizes that the two factors are correlated and wishes to test this assumption. The other parameter we have to specify is delta, which defaults to zero.

Larger positive values of delta increase the correlation among factors; SPSS caps how large delta can be, and negative delta pushes the solution toward orthogonality. The factor pattern matrix represents the partial standardized regression coefficients of each item on a particular factor.

Just as in orthogonal rotation, the square of a loading represents the contribution of the factor to the variance of the item, excluding the overlap between correlated factors.

The figure below shows the Pattern Matrix depicted as a path diagram. Remember to interpret each loading as the partial correlation of the item with the factor, controlling for the other factor. The more correlated the factors, the greater the difference between the pattern and structure matrices and the more difficult it is to interpret the factor loadings.

Looking at the Factor Pattern Matrix with an absolute-loading cutoff, we can identify which items load onto which factor. In the Factor Structure Matrix, we can look at the variance explained by each factor, not controlling for the other factors. In general, the loadings in the Structure Matrix will be higher than in the Pattern Matrix because we are not partialling out the variance of the other factors.

The figure below shows the Structure Matrix depicted as a path diagram. Remember to interpret each loading as the zero-order correlation of the item with the factor, not controlling for the other factor. Recall that the more correlated the factors, the greater the difference between the Pattern and Structure matrices and the more difficult it is to interpret the factor loadings.
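The relationship between the two matrices is a one-line computation: the Structure Matrix is the Pattern Matrix post-multiplied by the factor correlation matrix. The pattern matrix and factor correlation below are hypothetical values for illustration:

```python
import numpy as np

# Illustrative oblique solution: Pattern matrix P (partial standardized
# regression weights) and factor correlation matrix Phi (both hypothetical).
P = np.array([
    [ 0.70,  0.05],
    [ 0.65, -0.10],
    [ 0.02,  0.72],
    [-0.08,  0.68],
])
Phi = np.array([[1.0, 0.5],
                [0.5, 1.0]])

# Structure matrix: zero-order correlations of items with factors.
S = P @ Phi
print(S)  # S[0, 0] = 0.70 + 0.05 * 0.5 = 0.725, higher than P[0, 0]
```

Because the off-diagonal of Phi is positive, each structure loading absorbs a share of the other factor's loading, which is why structure loadings generally exceed pattern loadings; if the factors were uncorrelated (Phi equal to the identity), the two matrices would coincide.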

In our case, Factor 1 and Factor 2 are highly correlated, which is why there is such a big difference between the factor pattern and factor structure matrices; observe this in the Factor Correlation Matrix below. The difference between an orthogonal and an oblique rotation is that the factors in an oblique rotation are correlated. The angle of axis rotation is defined as the angle between the rotated and unrotated axes (the blue and black axes in the figure).

Initial Eigenvalues — Eigenvalues are the variances of the principal components. Because we conducted our principal components analysis on the correlation matrix, the variables are standardized, which means that each variable has a variance of 1 and the total variance is equal to the number of variables used in the analysis. Total — This column contains the eigenvalues.

The first component will always account for the most variance (and hence have the highest eigenvalue), the next component will account for as much of the leftover variance as it can, and so on.

Hence, each successive component accounts for less and less variance. The Cumulative % column shows, for example, the proportion of total variance that the first three components account for together. Remember that because this is principal components analysis, all variance is considered to be true and common variance.

In other words, the variables are assumed to be measured without error, so there is no error variance. Extraction Sums of Squared Loadings — The three columns of this half of the table exactly reproduce the values given on the same row on the left side of the table.

The number of rows reproduced on the right side of the table is determined by the number of principal components whose eigenvalues are 1 or greater. The scree plot graphs the eigenvalue against the component number. You can see these values in the first two columns of the table immediately above.

From the third component on, you can see that the line is almost flat, meaning that each successive component accounts for smaller and smaller amounts of the total variance. In general, we are interested in keeping only those principal components whose eigenvalues are greater than 1.

Components with an eigenvalue of less than 1 account for less variance than did the original variable (which had a variance of 1), and so are of little use. Hence, you can see that the point of principal components analysis is to use eigenvalue decomposition to redistribute the variance in the correlation matrix into the first components extracted.
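Both facts above — eigenvalues summing to the number of variables, and the eigenvalue-one retention rule — can be verified on simulated data (the data set below is synthetic, generated just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # 200 cases, 5 variables
X[:, 1] += X[:, 0]                   # induce some correlation

R = np.corrcoef(X, rowvar=False)     # 5 x 5 correlation matrix

eigvals = np.linalg.eigvalsh(R)[::-1]  # eigenvalues, largest first

# With a correlation matrix, each variable contributes a variance of 1,
# so the eigenvalues sum to the number of variables (here, 5).
print(eigvals.sum())

# Kaiser (eigenvalue-one) criterion: retain components with eigenvalue > 1.
retained = (eigvals > 1).sum()
print(retained)
```

The sum-to-five identity holds exactly (it is the trace of the correlation matrix); how many components survive the eigenvalue-one rule depends on the correlation structure of the particular data.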

Component Matrix — This table contains component loadings, which are the correlations between the variable and the component.

This makes the output easier to read by removing the clutter of low correlations that are probably not meaningful anyway.

If you are unsure how to interpret the results from these tests, we show you how in our enhanced PCA guide, which is part of our enhanced content (you can learn more about our enhanced content on our Features: Overview page). Assumption 4: Your data should be suitable for data reduction.

Effectively, you need to have adequate correlations between the variables in order for variables to be reduced to a smaller number of components. Interpretation of this test is provided as part of our enhanced PCA guide. Assumption 5: There should be no significant outliers. Outliers are important because these can have a disproportionate influence on your results. SPSS Statistics recommends determining outliers as component scores greater than 3 standard deviations away from the mean.
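The 3-standard-deviation rule for component scores is straightforward to apply outside SPSS as well. The scores below are hypothetical standardized component scores for 31 cases, constructed so one case is an obvious outlier:

```python
import numpy as np

# Hypothetical component scores: 30 well-behaved cases plus one extreme case.
scores = np.append(np.tile([0.5, -0.5], 15), 4.0)

# Standardize and flag cases more than 3 SD from the mean,
# following the SPSS Statistics heuristic described in the text.
z = (scores - scores.mean()) / scores.std(ddof=1)
outliers = np.where(np.abs(z) > 3)[0]
print(outliers)  # the index of the extreme case (30)
```

Note that with very small samples this rule can never fire — the maximum possible |z| in a sample of n cases is (n-1)/sqrt(n), which stays below 3 until n is about 11 — so the check is only meaningful with a reasonable number of cases.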

SPSS Statistics Example: A company director wanted to hire another employee for his company and was looking for someone who would display high levels of motivation, dependability, enthusiasm and commitment.

First, take a look through these seven steps: Step 1: You need to interpret the results from your assumption tests to make sure that you can use PCA to analyse your data. This includes analysing: (a) the scatterplots that you should have created to check the linearity of your variables (Assumption 2); (b) sampling adequacy, based on the Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy for the overall data set and the KMO measure for each individual variable (Assumption 3); (c) the suitability of your data for reduction, using Bartlett's test of sphericity (Assumption 4); and (d) the standard deviations of component scores, to check for significant outliers (Assumption 5).

Step 2: You need to inspect the initial extraction of components. At this point, there will be as many components as there are variables. You should focus on the Initial Eigenvalues to get an initial sense of the major components you have extracted and how much of the total variance each component explains.

However, at this stage, you should be aware not only that you don't yet have sufficient information to select components, but also that the output produced is based on the default options in SPSS Statistics.

Step 3: You need to determine the number of 'meaningful' components that you want to retain. To do this, you have a number of options: (a) use the eigenvalue-one criterion (the SPSS Statistics default); (b) use the proportion of total variance accounted for; (c) use the scree plot test; or (d) use the interpretability criterion. You need to consider why you would use one of these options over another, as well as the implications these choices might have for the number of components that are extracted.

You also have to consider the type of rotation you selected - whether Varimax, Direct Oblimin, Quartimax, Equamax or Promax - and how this will influence how your components 'load' onto different variables.

The goal is to achieve a 'simple structure'; that is, a structure where you have a readily explainable division of variables onto separate components, with each component loading onto at least three variables. Retaining a specific number of components simply involves a number of additional steps in which you instruct SPSS Statistics how many components to keep.

You will then have to reanalyse your data accordingly.


