What's FAiR?

FAiR is a package that enhances the ability to conduct Factor Analysis in R and provides some functionality that is not found in any other R package or statistical program. In addition to exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), FAiR implements a new way to estimate the factor analysis model called semi-exploratory factor analysis (SEFA). The essence of SEFA is that the user specifies how many coefficients in the “loading” matrix are exactly zero, and the locations of these exact zeros are estimated along with the values of the nonzero parameters. FAiR uses the genetic algorithm in the rgenoud package (RGENOUD) for restricted optimization, which makes it possible to extract or transform factors subject to inequality constraints on functions of multiple parameters.

How Does FAiR Relate to Existing R Packages?

Most open source software builds on other open source software, and FAiR is no exception. The following table briefly explains the relationship between FAiR and other R packages:

Package             Relationship   Usefulness
rgenoud             Dependency     Restricted optimization
gWidgetsRGtk2       Dependency     Graphical User Interface
rrcov               Dependency     Minimum Covariance Determinant estimator
Matrix              Dependency     Storing big symmetric matrices with packed storage
methods and stats4  Dependency     S4 classes
corpcor             Suggests       Shrinkage covariance estimator
mvnmle              Suggests       Estimate covariance among data with missingness
polycor             Suggests       Estimate covariance among continuous and ordinal variables
GPArotation         Suggests       Copied a lot of code and used to make starting values
nFactors            Suggests       Enhanced scree plots
Rgraphviz           Suggests       Make directed acyclic graphs
mvnormtest          Suggests       Test for multivariate normality
energy              Suggests       Test for multivariate normality
jit                 Suggests       More speed
stats               Modified code  Reused a lot of code from factanal()
sem                 Modified code  Reused a lot of code from several functions
psych               Modified code  Reused code from fa.graph()

A big thanks to everyone involved in the aforementioned R packages. My hope is that the links between FAiR and other packages will become more numerous and stronger in future versions.

How Does FAiR Relate to Other Factor Analysis Software?

There are already three main tools for factor analysis in R, namely factanal() in the stats package to extract EFA factors assuming the data are multivariate normal, the GPArotation package to transform factors in EFA, and the sem package for CFA assuming the data are multivariate normal. The primary ways in which FAiR adds to this collection are that it has discrepancy functions that do not assume the data are multivariate normal, it permits inequality restrictions on functions of multiple parameters, and it implements SEFA.

In comparison to software outside of R, FAiR differs in three primary ways. First, FAiR is primarily licensed under the Affero General Public License. Outside of R and TETRAD, there are no open-source tools for factor analysis and many of the widely used packages for factor analysis are commercial.

Second, FAiR tries to (re)implement the philosophy of factor analysis outlined in a book by Allen Yates, which has the goal of restoring (exploratory) factor analysis’ original scientific purpose of making inferences about how outcomes relate to latent factors in the population, rather than serving merely as a tool for describing how outcomes relate to each other within a sample. A few software packages implement Yates’ geomin criterion for factor transformation, but the geomin criterion was only one component of Yates’ perspective on factor analysis. FAiR tries to incorporate all of Yates’ insights, although the implementations in FAiR differ from Yates’ algorithms.

Third, FAiR includes functionality that is not available anywhere else and would not be feasible without a genetic optimization algorithm. In particular, the ability to estimate SEFA models is unique to FAiR, and FAiR is also the only factor analysis software that can impose inequality restrictions on functions of multiple parameters, which is a very useful and promising feature that can be used to implement many of Yates’ ideas.

On the other hand, FAiR has only been in development for a few months whereas other programs for factor analysis often have been in development for many years. Thus, FAiR currently lacks some features that are available in other software and probably has bugs of varying magnitudes. Hopefully, both of these shortcomings will be remedied in time.

Yes, But Can FAiR Recover the Dimensions of Cardboard Boxes?

Not only can FAiR solve Thurstone’s box problem, it is the first software to solve it via the analytic criterion that Thurstone originally proposed to characterize simple structure numerically. Although the box problem can be solved by several criteria, their success usually depends on either weighting schemes or good starting values. RGENOUD makes it possible to find the global optimum of Thurstone’s criterion (Φ), subject to a constraint that prevents factor collapse, without weighting schemes and from any reasonable starting values. There is a bit of hackery below because we do not have Thurstone’s original covariance matrix and therefore cannot re-extract his three factors with Factanal(), but here is the demonstration using Rotate():

library(FAiR) 
 
## Get initial solution to Thurstone's box problem
 # Taken from data(Thurstone, package = "GPArotation")
box26 <- rbind(
c(0.629, -0.494, 0.579),
c(0.751, 0.602, 0.125),
c(0.765, -0.230, -0.572),
c(0.866, 0.131, 0.459),
c(0.873, -0.473, -0.042),
c(0.906, 0.250, -0.323),
c(0.824, -0.149, 0.528),
c(0.859, 0.358, 0.306),
c(0.812, -0.518, 0.203),
c(0.951, -0.441, -0.254),
c(0.876, 0.406, -0.185),
c(0.885, 0.095, -0.431),
c(-0.102, -0.936, 0.322),
c(0.102, 0.936, -0.322),
c(-0.081, -0.163, 0.969),
c(0.081, 0.163, -0.969),
c(0.006, 0.810, 0.582),
c(-0.006, -0.810, -0.582),
c(0.852, 0.223, 0.420),
c(0.861, -0.483, -0.094),
c(0.912, 0.248, -0.304),
c(0.847, 0.218, 0.405),
c(0.845, -0.456, -0.106),
c(0.902, 0.246, -0.272),
c(0.987, -0.026, 0.043),
c(0.965, 0.057, -0.028))
 
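set.seed(12345) # arbitrary seed (not in the original) so the simulated noise is reproducible
## Build a covariance matrix implied by the loadings plus small random uniquenesses
noise <- diag(runif(nrow(box26), max = .02))
Sigma <- tcrossprod(box26) + noise
## Set up and estimate a three-factor EFA so that there is an object to overwrite below
man <- make_manifest(covmat = Sigma)
res <- make_restrictions(man, factors = 3, model = "EFA")
efa <- Factanal(man, res, impatient = TRUE)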
 
## Hack
efa@loadings[,,1:4] <- box26
efa@loadings[,,5] <- box26^2
 
## Now Rotate() using Thurstone's criterion with a restriction to prevent factor collapse
efa_rotated <- Rotate(efa, criteria = list("phi"), 
                      methodArgs = list(nfc_threshold = 0.3, c = 1.0))
 
coef(efa_rotated) # close to true loadings
efa_rotated@correlations[,,"PF"] # close to true primary factors
 
## Raise toast to Thurstone
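
For readers unfamiliar with Φ, one common textbook statement of Thurstone's criterion is the sum, over outcomes, of the product of each outcome's squared coefficients across factors, so that Φ reaches zero when every row contains at least one exact zero. The base-R sketch below illustrates that statement only. It is an assumption about the general form, not FAiR's internal implementation (methodArgs such as c and the factor-collapse restriction are not modeled here), and the helper name thurstone_phi and the two small matrices are hypothetical.

```r
# Hypothetical helper: sum over rows of the product of squared coefficients.
thurstone_phi <- function(L) {
  L <- as.matrix(L)
  sum(apply(L^2, 1, prod))
}

# Every row has at least one exact zero, so every row product vanishes.
simple <- rbind(c(0.8, 0.0, 0.3),
                c(0.0, 0.7, 0.5),
                c(0.6, 0.4, 0.0))

# No zeros anywhere, so every row contributes a positive amount.
dense <- matrix(0.5, nrow = 3, ncol = 3)

phi_simple <- thurstone_phi(simple)
phi_dense  <- thurstone_phi(dense)
print(c(simple = phi_simple, dense = phi_dense))
```

A transformation that drives this quantity toward zero is pushing each outcome to have a negligible coefficient on at least one factor, which is the numerical sense of "simple structure" referred to above.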

That's Nice but SEFA is Logically Impossible

Not quite impossible LOL. See the example in the Factanal help file, which produces:

 > show(sefa)
 
Call:
Factanal(manifest = man, restrictions = res)
 
Number of observations:  112
 
Discrepancy:  7.055898
 
Semi-exploratory factor analysis with  2 factors
All free factor intercorrelations are on the [-1,1] interval
 
All coefficients on the [ -1.5 , 1.5 ] interval
 
Zeros per factor
      A B
zeros 2 2
Mapping rule: default
 
Discrepancy function:  MLE
 
 6 degrees of freedom
 
 > summary(sefa)
 
Call:
Factanal(manifest = man, restrictions = res)
 
Point estimates (blanks, if any, are exact zeros):
        F1     F2            Uniqueness
general  0.405  0.467         0.449
picture         0.642         0.588
blocks  -0.009  0.889         0.218
maze            0.478         0.772
reading  0.938                0.120
vocab    0.844                0.288
 
F1       1.000  0.446
F2       0.446  1.000
 

Note that nothing dictated that the zeros would fall at [2,1], [4,1], [5,2], and [6,2]. The only requirement was that there would be two zeros in each column.
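
To get a feel for what estimating the locations of the zeros involves, the sketch below simply counts the candidate zero patterns for this example: six outcomes, two factors, and two exact zeros per column. It uses only base R, the variable names are illustrative, and it deliberately ignores any patterns that FAiR's mapping rule might exclude, so the count is an upper bound rather than a description of FAiR's search.

```r
n_outcomes <- 6       # rows of the loading matrix in the example above
zeros_per_factor <- 2 # exact zeros requested in each column
n_factors <- 2

# Each column independently picks which rows hold its zeros.
patterns_per_column <- choose(n_outcomes, zeros_per_factor)
n_patterns <- patterns_per_column^n_factors

# Enumerate the row pairs for one column; the reported solution puts
# the first factor's zeros in rows 2 and 4 (picture and maze).
row_pairs <- t(combn(n_outcomes, zeros_per_factor))
reported_F1 <- which(row_pairs[, 1] == 2 & row_pairs[, 2] == 4)

print(n_patterns)
```

With 15 choices per column there are 225 patterns in this small example, and the count grows quickly as outcomes and factors are added.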

There are more examples in the paper linked below.

Helpful Links

To PDFs involving FAiR

SEFAiR So Far, a paper explaining SEFA (needs an update soon)

Restrictions, Factor Analysis, and Genetic Algorithms, a short paper explaining how FAiR works.

FAiR Vignette, has screenshots of the GUI menus

Manual (WARNING: not very clear yet, particularly if you have not read the previous three PDFs)

To Launchpad where the trunk of FAiR is hosted

If you use Bazaar, you can branch FAiR by executing bzr branch lp:fair .

To ask questions about the code, statistics, or ideas in FAiR, click here.

To file a bug against the code, click here. In particular, shortcomings in the documentation should be registered here.

To request a feature (called a “blueprint” by Launchpad), click here.

To blogs and so forth

Interesting discussion between Ben Goodrich and Cosma Shalizi

Instructions for Installing FAiR

Installing FAiR is slightly more complicated than a typical R package due to FAiR’s rudimentary point-and-click menu system.

First, one needs to install R version 2.6.0 or later (go here for binaries or here for detailed instructions). R can then be started by clicking the R icon in the program menu (Windows) or Finder (Mac) or, on non-Windows systems, by opening a shell (e.g. bash) and executing the command R.

Second, one needs to install the GTK libraries. This step is derived from John Verzani’s website. On Windows, the easiest way to do so is to start R and execute

source("http://www.math.csi.cuny.edu/pmg/installpmg.R")

and follow the prompts, accepting the default options. Be sure to restart R when the installation is finished.

On Mac, the easiest way to install the GTK libraries is to install this universal binary.

On Linux, the GTK libraries are probably already installed, even if you use a KDE-centric distro. If not, install them using your favorite package manager. On Debian-based distros, this can be accomplished with apt-get install r-cran-rgtk2 .

Finally, FAiR itself needs to be installed, along with its dependencies. The easiest way to do so on all platforms is to start R and execute

install.packages(c("gWidgetsRGtk2", "rgenoud", "rrcov", "Matrix"), dependencies = TRUE)
install.packages(c("corpcor", "mvnmle", "polycor", "nFactors", "mvnormtest", "energy",  "GPArotation"), dependencies = TRUE) # optional packages
install.packages("FAiR", dependencies = TRUE)
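
Before loading FAiR it can help to confirm which of the packages named above are actually available. The following base-R sketch only checks for their presence and installs nothing; the vector and function names are illustrative.

```r
required <- c("gWidgetsRGtk2", "rgenoud", "rrcov", "Matrix")
optional <- c("corpcor", "mvnmle", "polycor", "nFactors",
              "mvnormtest", "energy", "GPArotation")

# TRUE for each package that R can find in the current library paths.
is_installed <- function(pkgs) {
  vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
}

missing_required <- required[!is_installed(required)]
missing_optional <- optional[!is_installed(optional)]

if (length(missing_required) > 0) {
  message("Still missing (required): ", paste(missing_required, collapse = ", "))
}
if (length(missing_optional) > 0) {
  message("Still missing (optional): ", paste(missing_optional, collapse = ", "))
}
```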

After following these steps, you can load FAiR, run its examples, and open its help file from within R by executing the following:

library(FAiR)
example(Factanal)
help("FAiR-package")

Good covariance matrices for playing around with are ability.cov, Harman23.cor, and Harman74.cor, all of which are in the datasets package, which is typically loaded at run time.
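
As a quick baseline before trying FAiR on these matrices, the sketch below fits a two-factor model to ability.cov with factanal() from the stats package, the existing tool for extracting EFA factors under multivariate normality. It uses only base R; the object name fit and the choice of the promax rotation are illustrative, and the result is an ordinary EFA, not a SEFA solution.

```r
# ability.cov is a list holding a covariance matrix, centers, and n.obs.
str(ability.cov$n.obs)

# Maximum-likelihood EFA with two factors and an oblique (promax) rotation.
fit <- factanal(factors = 2, covmat = ability.cov, rotation = "promax")

print(loadings(fit), cutoff = 0) # show every coefficient, however small
print(fit$uniquenesses)
```

Comparing these coefficients with the output of a FAiR model on the same matrix is one way to see what the exact zeros in SEFA change.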

Interface Issues

On Mac, FAiR will not work properly (i.e. it will crash) unless R is run via the X server. There are two ways to be safe. One is to start the R GUI by clicking the R icon in /Applications, then click the X icon in the top center of the R GUI, and then execute library(FAiR). The other, if you do not have the R GUI, is to execute open -a X11.app in a terminal, then execute R, and then library(FAiR). If you do not have X11 installed, it is quite difficult to use FAiR.

On Windows, I suggest that you disable buffering of output by pressing Ctrl-W or by unchecking Misc -> Buffered output. Doing so will allow you to watch the progress of the genetic algorithm without having to move the mouse to flush the buffer.

On Linux, if you use FAiR via ssh, be sure to use the -X option, e.g. ssh -X myname@myserver . Otherwise, the GUI menus will cause a crash.

 
packages\cran\fair.txt · Last modified: 2008/07/14