Lecture: Factor Analysis (cont)




Factor Analysis (cont)

Steps in conducting a factor analysis

There are four basic factor analysis steps:

Data collection and generation of the correlation matrix
Extraction of an initial factor solution
Rotation and interpretation
Construction of scales or factor scores to use in further analyses

Extraction of an initial solution

The output of a factor analysis will give you several things. The table below shows how the output helps to determine the number of components/factors to be retained for further analysis.

One good rule of thumb for determining the number of factors is the "eigenvalue greater than 1" criterion. For the moment, let's not worry about the meaning of eigenvalues; the point is that this criterion allows us to be fairly sure that any factor we keep will account for at least as much variance as one of the variables used in the analysis. However, when applying this rule, keep in mind that when the number of variables is small, the analysis may yield fewer factors than "really" exist in the data, while a large number of variables may produce more factors meeting the criterion than are meaningful. There are other criteria for selecting the number of factors to keep, but this is the easiest to apply, since it is the default in most statistical computer programs.

[Table: Sample extraction of components/factors]

[Table: Unrotated Factor Matrix]

[Table: Rotated Factor Matrix]

Naming the factors

Now we have a highly interpretable solution, which accounts for almost 90% of the variance in the data. The next step is to name the factors. There are a few rules suggested by methodologists:

Factor names should be brief: one or two words.
Factor names should communicate the nature of the underlying construct.
Look for patterns of similarity between items that load on a factor.
If you are seeking to validate a theoretical structure, you may want to use the factor names that already exist in the literature.
Otherwise, use names that will communicate your conceptual structure to others.
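The advice to look for patterns among the items that load on a factor can be made more systematic by listing, for each factor, the items whose loadings are large in absolute value. The following is only a minimal sketch: the loading values, the item names, and the 0.40 cutoff are all hypothetical and are not taken from the output tables of this lecture.

```python
# Hypothetical rotated loadings (item -> loading on factor 1, factor 2).
# These numbers are invented for illustration only.
LOADINGS = {
    "item_a": (0.82, 0.10),
    "item_b": (0.75, -0.05),
    "item_c": (0.12, 0.88),
    "item_d": (-0.08, -0.71),
    "item_e": (0.20, 0.15),
}
THRESHOLD = 0.40  # assumed cutoff for calling a loading "salient"


def items_by_factor(loadings, threshold=THRESHOLD):
    """Group items under every factor on which they load saliently."""
    n_factors = len(next(iter(loadings.values())))
    groups = {f: [] for f in range(n_factors)}
    unassigned = []  # items with no salient loading on any factor
    for item, row in loadings.items():
        salient = [(f, v) for f, v in enumerate(row) if abs(v) >= threshold]
        if not salient:
            unassigned.append(item)
        for f, v in salient:
            groups[f].append((item, v))
    # Sort each group by absolute loading, largest first.
    for members in groups.values():
        members.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return groups, unassigned


groups, unassigned = items_by_factor(LOADINGS)
for f, members in groups.items():
    print(f"Factor {f + 1}:", members)
print("No salient loading:", unassigned)
```

Reading the items grouped under each factor side by side is what suggests a name. A negative loading, such as the hypothetical item_d, marks an item that runs opposite to the factor.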
In addition, you can try looking at which items do not load on a factor, to determine what that factor isn't. Also, try reversing loadings to get a better interpretation.

Let us look at a concrete example. Table 1 presents information on fourteen nations for ten characteristics. The nations are selected to reflect major regional, political, economic, and cultural groupings; the characteristics reflect different facets of each nation, including domestic instability and foreign conflict. The table thus contains 14 × 10, or 140, pieces of information for 1955. Factor analysis addresses itself to this question: "What are the patterns of relationship among these data?"

These patterns can be viewed from two perspectives. One can look at the pattern of variation of nations across their characteristics, and then group the nations by their profile similarity. One might group together nations which are all high on GNP per capita, low on trade, high on power, etc. When applied to discern patterns of profile similarity of individuals, groups, or nations, the analysis is called Q-factor analysis.

The regularity in the data of Table 1 can be looked at from a second perspective, however. The focus now is the patterns of variation of characteristics. In Table 1, for example, nations high on GNP per capita also appear low on trade and high on power. There is a regularity, therefore, in the nation values on these three characteristics, and this regularity is described as a pattern of variation. Many of our social concepts define such patterns. For example, the concept of "economic development" involves (among other things) GNP per capita, literacy, urbanization, education, and communication; it is a pattern because these characteristics are highly intercorrelated with each other. Factor analysis applied to delineate patterns of variation in characteristics is called R-factor analysis.

To obtain this output:

File, Open, point to gss 93 subset.sav.
Statistics, Data Reduction, Factor Analysis.
In the Factor Analysis dialog box, enter all the variables listed above in the "Variables" box.
Click on the Descriptives button and check Coefficients and Significance Levels.
Click on the Extraction button and under Display check Unrotated Factor Matrix and Scree Plot. Leave as defaults the settings for Analyze Correlation Matrix and Extract Eigenvalues over 1.
Click on the Rotation button and select Varimax. Under Display, check Rotated Solution and Loading Plots.
Click on the Scores button and check "Display factor score coefficient matrix".
Click on the Options button and check "Coefficient Display Format, Sorted by Size".
Click on OK to run the procedure.

Another example

Factor analysis is a data reduction technique for identifying the internal structure of a set of variables. Unlike techniques such as regression analysis or ANOVA, factor analysis does not require predictor and criterion variables to be defined. Factor analysis attempts to identify the relationships among all variables included in the analysis set. The variables included in the analysis have a portion of their variance explained by certain underlying common dimensions, called the factors. Factor analysis helps in identifying this set of k dimensions underlying the m variables in a data set (where k < m).

For discussion purposes, consider the following five-variable data set, later used for the Factor program.

79652
55462
12345
16523
46525
79665
65321
98653
46521
65435
32165
56523
65454
16589
98965
73195
15937
35079
62486
46428

These data represent the scores (0 to 9 scale) of 20 students on five finals (e.g. Math, English, History, Geography, Science). Can we say that the students' exam grades in the different subjects are related? The relationships between the student grades are not directly measurable, but are in fact latent.
Grades in different courses could be related because of the student's intellectual capabilities, memory capacity, or simply interest. Although one person's test grades may not be perfectly correlated with one another, we can expect the grades in all subject areas to depend to some degree on general intelligence or other factors common to learning the subject material. Accordingly, we may identify one or more factors that explain the 'common' portion of the variance in the original raw scores.
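As a rough illustration of how such factors might be extracted from the 20 five-digit records listed above, the sketch below computes the correlation matrix of the five exams, takes its eigenvalues, and applies the "eigenvalue greater than 1" rule. It assumes principal components extraction, which is an assumption on our part, since the text does not say which method the Factor program uses. It also assumes each digit of a record is one exam score and that NumPy is available. The loadings it produces are unrotated.

```python
import numpy as np

# The 20 student records from the text; each digit is one exam score (0-9).
RAW = ("79652 55462 12345 16523 46525 79665 65321 98653 46521 65435 "
       "32165 56523 65454 16589 98965 73195 15937 35079 62486 46428")

# Rows are students, columns are the five exams.
scores = np.array([[int(d) for d in rec] for rec in RAW.split()], dtype=float)

# Correlation matrix of the five exams (columns are the variables).
corr = np.corrcoef(scores, rowvar=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending
# order, so reorder them from largest to smallest.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# "Eigenvalue greater than 1" rule: keep components that account for more
# variance than a single standardized variable.
n_keep = int(np.sum(eigenvalues > 1.0))

# Unrotated loadings: each kept eigenvector scaled by sqrt(eigenvalue).
loadings = eigenvectors[:, :n_keep] * np.sqrt(eigenvalues[:n_keep])

# Share of total variance carried by each component (eigenvalues sum to 5).
explained = eigenvalues / eigenvalues.sum()

print("Eigenvalues:", np.round(eigenvalues, 3))
print("Components kept:", n_keep)
print("Proportion of variance:", np.round(explained, 3))
print("Unrotated loadings:\n", np.round(loadings, 3))
```

Because the five variables are standardized, the eigenvalues sum to 5, so an eigenvalue above 1 marks a component that carries more variance than any single exam. A rotation such as varimax would normally follow before interpreting the loadings; it is left out here to keep the sketch short.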