---
title: "Aspects of Correlation Matrices"
author: "Jan de Leeuw"
date: "May 25, 2015"
output: html_document
bibliography: aspect.bib
---

Suppose we have a dataframe with $n$ observations on $m$ variables, i.e. $m$ vectors in $\mathbb{R}^n$. We want to transform the $m$ variables in such a way that a particular _aspect_ of the correlation matrix of the variables is maximized. An _aspect_ is a real-valued function $f$ defined on the space of correlation matrices.

We formalize this using the framework discussed in De Leeuw [-@deleeuw_A_88a; -@deleeuw_C_88b]. Variable $j$ defines a convex cone $\mathcal{K}_j$ in $\mathbb{R}^n$. We suppose all vectors in the cone $\mathcal{K}_j$ are centered. Also define the sphere $\mathcal{S}$ of all vectors in $\mathbb{R}^n$ with sum of squares equal to one. For any choice of the $x_j\in\mathcal{K}_j\cap\mathcal{S}$ we can compute the correlation matrix $R(X)=\{r_{j\ell}(X)\}$ with $r_{j\ell}(X)=x_j'x_\ell.$ Our optimization problem is to maximize $f(R(X))$ over all $x_j\in\mathcal{K}_j\cap\mathcal{S}.$

In our software we define the cones $\mathcal{K}_j$ as the sets of all vectors $x$ such that $x=Zb$ and (optionally) $Ax\geq 0$, where $Z$ and $A$ are given matrices. In @mair_deleeuw_A_10 the R package aspect is presented. It deals with the special case where the variables are categorical and the $Z$ are _indicator matrices_ (a.k.a. _dummies_). On page 20 of @mair_deleeuw_A_10 it says:

> Considering each variable as categorical is not very efficient when having many categories, as typically
> in the numerical case. Therefore, in a future update we will use splines to make it more efficient.

Well, this is that future update. In this note we develop R software that handles the much more general case where the $Z$ are B-spline bases. The inequality constraints can be used to require that the spline transformations $Zb$ are monotone with the data. This covers the categorical case, because spline bases of degree zero are indicator matrices.
It also covers polynomial transformations, by using splines without interior knots. And finally our software also allows for _nominal_ variables, by not imposing monotonicity restrictions.

We now use a combination of tangential majorization and block relaxation [@deleeuw_E_15a; @deleeuw_E_15b] to construct an algorithm maximizing $f$. Consider the case in which $f$ is convex on the convex set of correlation matrices. Then
$$f(R(X))\geq f(R(Y))+\mathbf{tr}\ G(Y)(R(X)-R(Y)),$$
where $G(Y):=\mathcal{D}f(R(Y)).$ In a majorization iteration we optimize $\mathbf{tr}\ G(Y)R(X).$ For this we use cyclic block relaxation, scaling one variable at a time while keeping the others fixed at their current values. Because $\mathbf{diag}(R(X))=\mathbf{diag}(R(Y))$ we solve subproblem $j$ in the cycle by optimizing $x_j't_j$ over $x_j\in\mathcal{K}_j\cap\mathcal{S}$, where $t_j$ is the _target_
$$t_j:=\sum_{\ell\neq j}g_{j\ell}(Y)x_\ell.$$
Of course $t_j$ depends on $Y$ and on the $x_\ell$ with $\ell\neq j$. We find the new $x_j$ by minimizing $(t_j-Z_jb_j)'(t_j-Z_jb_j)$ over $b_j$ with $A_jZ_jb_j\geq 0$, and by projecting the optimal solution $\hat x_j=Z_j\hat b_j$ on the sphere $\mathcal{S}.$ We assume throughout that $\hat x_j$ is non-zero, i.e. the target is not in the polar cone.

In our R implementation we have incorporated some of the more obvious aspect choices that are convex functions of the correlation matrix: the sum of the dominant $p$ eigenvalues (principal component analysis), the squared multiple correlation of one variable with the others (multiple regression), the sum of the squared multiple correlations of each variable with the others (image analysis), the negative logarithm of the determinant, the sum of $p$th powers of the correlations (with suitable $p$), and the sum of $p$th powers of the absolute values of the correlations.
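For a nominal variable (no inequality constraints) the inner update just described is simply a least squares projection on the cone, followed by centering and normalization to land back on $\mathcal{S}$. Here is a minimal standalone sketch, with a made-up basis and target (not the package code given below):

```r
set.seed (7)
n <- 10
Z <- cbind (1, rnorm (n), rnorm (n))               # stand-in for a spline basis
target <- rnorm (n)                                # stand-in for the target t_j
b <- solve (crossprod (Z), crossprod (Z, target))  # minimize (t - Zb)'(t - Zb)
xhat <- drop (Z %*% b)
xhat <- xhat - mean (xhat)                         # center
xhat <- xhat / sqrt (sum (xhat ^ 2))               # project on the sphere S
```

For an ordinal variable the least squares step is replaced by an inequality constrained regression, which the implementation below delegates to nnls.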
Thus our implementation covers both non-linear principal component analysis [@young_takane_deleeuw_A_78; @winsberg_ramsay_83; @koyak_87; @linting_meulman_groenen_vanderkooij_07; @deleeuw_C_14] and multiple regression with optimal scaling [@kruskal_65; @young_deleeuw_takane_A_76; @winsberg_ramsay_80; @vanderkooij_meulman_heiser_07]. And much more, obviously.

### Code

```{r aspect}
aspect <- function (data, knots, degrees, ordinal, afunc,
                    eps = 1e-6, itmax = 100, verbose = 1, ...) {
  m <- ncol (data)
  n <- nrow (data)
  itel <- 1
  bd <- basesData (data, knots, degrees, ordinal)
  tdata <- matrix (0, n, m)
  for (j in 1:m) {
    tdata[, j] <- bd$x[[j]]
  }
  tdata <- apply (tdata, 2, function (z) z - mean (z))
  tdata <- apply (tdata, 2, function (z) z / sqrt (sum (z ^ 2)))
  corr <- crossprod (tdata)
  af <- afunc (corr, ...)
  fold <- af$f
  g <- af$g
  repeat {
    for (j in 1:m) {
      target <- drop (tdata[, -j] %*% g[-j, j])
      k <- bd$b[[j]]
      v <- bd$v[[j]]
      u <- drop (crossprod (k, target))
      s0 <- sum (target * tdata[, j])
      if (ordinal[j]) {
        ns <- nnls (v, u)
        rr <- residuals (ns)
        tt <- drop (k %*% rr)
      } else
        tt <- drop (k %*% u)
      tt <- tt - mean (tt)
      sq <- sum (tt ^ 2)
      if (sq > 1e-15) {
        tt <- tt / sqrt (sum (tt ^ 2))
        tdata[, j] <- tt
      }
      s1 <- sum (target * tdata[, j])
      if (verbose > 1)
        cat (
          "**** Variable", formatC (j, width = 3, format = "d"),
          "Before", formatC (s0, digits = 8, width = 12, format = "f"),
          "After", formatC (s1, digits = 8, width = 12, format = "f"),
          "\n"
        )
      corr <- crossprod (tdata)
      af <- afunc (corr, ...)
      fnew <- af$f
      g <- af$g
    }
    if (verbose > 0)
      cat (
        "Iteration: ", formatC (itel, width = 3, format = "d"),
        "fold: ", formatC (fold, digits = 8, width = 12, format = "f"),
        "fnew: ", formatC (fnew, digits = 8, width = 12, format = "f"),
        "\n"
      )
    if ((itel == itmax) || ((fnew - fold) < eps))
      break
    itel <- itel + 1
    fold <- fnew
  }
  return (list (tdata = tdata, f = fnew, r = corr, g = g))
}
```

The program uses the nnls package [@mullen_vanstokkum_12] for the inequality restricted regressions, with the dual algorithm described by De Leeuw [-@deleeuw_E_15c]. The inequality constraints always require monotonicity of the transformed values $Zb$ with the original data.

The function aspect starts out by calling the subroutine basesData, which constructs and stores orthonormalized B-spline bases, initial transformation estimates, and other auxiliary quantities in lists. The subroutine uses an ad-hoc B-spline implementation written in C [@deleeuw_E_15e], and a similarly ad-hoc Gram-Schmidt in C [@deleeuw_E_15d]. For reproducibility purposes the C code is incorporated in this document. Note that we do not use monotone splines, such as the integrated B-splines of Winsberg and Ramsay [-@winsberg_ramsay_80; -@winsberg_ramsay_83]; we merely require that our splines are monotone at the data points.
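To make the last point concrete: monotonicity of $Zb$ at the data points is just a finite set of linear inequalities on $b$. A small hypothetical illustration, using a quadratic polynomial basis (i.e. a degree two spline without interior knots) rather than the orthonormalized B-spline basis the package builds:

```r
set.seed (1)
x <- sort (runif (6))
Z <- cbind (1, x, x ^ 2)   # basis for quadratics: a degree 2 spline, no interior knots
A <- diff (Z)              # rows are differences of consecutive basis rows
b <- c(0, 1, 1)            # coefficients of a quadratic increasing on (0, 1)
# A %*% b >= 0 holds exactly when Zb is monotone at the (sorted) data points
all (A %*% b >= 0)
```

The constraint matrix built by basesData below is of this type, with rows indexed by consecutive order statistics of the data.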
```{c splinebasis, echo = FALSE, results = "hide"}
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double bs (int nknots, int nspline, int degree, double x, double * knots);
int mindex (int i, int j, int nrow);

void splinebasis (int *d, int *n, int *m, double * x, double * knots, double * basis) {
  int mm = *m, dd = *d, nn = *n;
  int k = mm - dd - 1, i, j, ir, jr;
  for (i = 0; i < nn; i++) {
    ir = i + 1;
    if (x[i] == knots[mm - 1]) {
      basis [mindex (ir, k, nn) - 1] = 1.0;
      for (j = 0; j < (k - 1); j++) {
        jr = j + 1;
        basis [mindex (ir, jr, nn) - 1] = 0.0;
      }
    } else {
      for (j = 0; j < k; j++) {
        jr = j + 1;
        basis [mindex (ir, jr, nn) - 1] = bs (mm, jr, dd + 1, x[i], knots);
      }
    }
  }
}

int mindex (int i, int j, int nrow) {
  return (j - 1) * nrow + i;
}

double bs (int nknots, int nspline, int updegree, double x, double * knots) {
  double y, y1, y2, temp1, temp2;
  if (updegree == 1) {
    if ((x >= knots[nspline - 1]) && (x < knots[nspline]))
      y = 1.0;
    else
      y = 0.0;
  } else {
    temp1 = 0.0;
    if ((knots[nspline + updegree - 2] - knots[nspline - 1]) > 0)
      temp1 = (x - knots[nspline - 1]) / (knots[nspline + updegree - 2] - knots[nspline - 1]);
    temp2 = 0.0;
    if ((knots[nspline + updegree - 1] - knots[nspline]) > 0)
      temp2 = (knots[nspline + updegree - 1] - x) / (knots[nspline + updegree - 1] - knots[nspline]);
    y1 = bs (nknots, nspline, updegree - 1, x, knots);
    y2 = bs (nknots, nspline + 1, updegree - 1, x, knots);
    y = temp1 * y1 + temp2 * y2;
  }
  return y;
}
```

```{c gs, echo = FALSE, results = "hide"}
#include <math.h>

void gsc (double *x, double *q, double *r, int *n, int *m) {
  int i, j, l, jn, ln, jm, imax = *n, jmax = *m;
  double s = 0.0;
  for (i = 0; i < imax; i++)
    s += *(x + i) * *(x + i);
  s = sqrt (s);
  *r = s;
  for (i = 0; i < imax; i++)
    *(q + i) = *(x + i) / s;
  for (j = 1; j < jmax; j++) {
    jn = j * imax;
    jm = j * jmax;
    for (l = 0; l < j; l++) {
      ln = l * imax;
      s = 0.0;
      for (i = 0; i < imax; i++)
        s += *(q + ln + i) * *(x + jn + i);
      *(r + jm + l) = s;
      for (i = 0; i < imax; i++)
        *(q + jn + i) += s * *(q + ln + i);
    }
    for (i = 0; i < imax; i++)
      *(q + jn + i) = *(x + jn + i) - *(q + jn + i);
    s = 0.0;
    for (i = 0; i < imax; i++)
      s += *(q + jn + i) * *(q + jn + i);
    s = sqrt (s);
    *(r + jm + j) = s;
    for (i = 0; i < imax; i++)
      *(q + jn + i) /= s;
  }
}
```

```{r splines, echo = FALSE}
bsplineBasis <- function (x, degree, innerknots,
                          lowknot = min (x, innerknots),
                          highknot = max (x, innerknots)) {
  innerknots <- unique (sort (innerknots))
  knots <- c(rep (lowknot, degree + 1), innerknots, rep (highknot, degree + 1))
  n <- length (x)
  m <- length (innerknots) + 2 * (degree + 1)
  nf <- length (innerknots) + degree + 1
  basis <- rep (0, n * nf)
  res <- .C ("splinebasis", d = as.integer (degree), n = as.integer (n),
             m = as.integer (m), x = as.double (x),
             knots = as.double (knots), basis = as.double (basis))
  basis <- matrix (res$basis, n, nf)
  basis <- basis[, which (colSums (basis) > 0)]
  return (basis)
}
```

```{r gram, echo = FALSE}
gsrc <- function (x) {
  n <- dim (x)[1]
  m <- dim (x)[2]
  q <- matrix (0, n, m)
  r <- matrix (0, m, m)
  qr <- .C ("gsc", as.double (x), as.double (q), as.double (r),
            as.integer (dim (x)[1]), as.integer (dim (x)[2]))
  return (list (q = matrix (qr[[2]], n, m), r = matrix (qr[[3]], m, m)))
}
```

```{r basesdata}
basesData <- function (data, knots, degrees, ordinal) {
  m <- ncol (data)
  n <- nrow (data)
  b <- as.list (rep (0, m))
  v <- as.list (rep (0, m))
  x <- as.list (rep (0, m))
  for (j in 1:m) {
    b[[j]] <- bsplineBasis (data[[j]], degrees[j], knots[[j]])
    nb <- ncol (b[[j]])
    x[[j]] <- drop (b[[j]] %*% (1 : nb))
    b[[j]] <- apply (b[[j]], 2, function (z) z - mean (z))
    b[[j]] <- (gsrc (b[[j]])$q)[, 1 : (nb - 1), drop = FALSE]
    if (ordinal[j]) {
      v[[j]] <- matrix (0, nb - 1, n - 1)
      r <- rank (data[[j]], ties.method = "first")
      for (i in 1 : (n - 1)) {
        v[[j]][, i] <- b[[j]][which (r == i), ] - b[[j]][which (r == (i + 1)), ]
      }
    }
  }
  return (list (b = b, v = v, x = x))
}
```

Here are the implementations of various aspects. Of course the user can add her own aspect by writing a similar function.
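To show what a user-supplied aspect would look like, here is a hypothetical one that is not in the list below: the sum of squared correlations, $f(R)=\sum_{j,\ell}r_{j\ell}^2$, with gradient $G(R)=2R$, together with a finite difference check of the gradient in a symmetric direction:

```r
# hypothetical user-written aspect: sum of squared correlations
maxssq <- function (r) {
  f <- sum (r ^ 2)
  g <- 2 * r
  return (list (f = f, g = g))
}
# finite difference check of the gradient
r <- matrix (c(1, .5, .5, 1), 2, 2)
d <- matrix (c(0, 1, 1, 0), 2, 2)   # symmetric direction, zero diagonal
h <- 1e-6
num <- (maxssq (r + h * d)$f - maxssq (r)$f) / h
ana <- sum (maxssq (r)$g * d)
```

Any function returning a list with the aspect value f and the gradient g can be passed to aspect as the afunc argument.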
```{r various}
maxeig <- function (r, p) {
  e <- eigen (r)
  f <- sum (e$values[1:p])
  g <- tcrossprod (e$vectors[, 1:p])
  return (list (f = f, g = g))
}

maxcor <- function (r, p) {
  f <- sum (r ^ p)
  g <- p * (r ^ (p - 1))
  return (list (f = f, g = g))
}

maxabs <- function (r, p) {
  f <- sum (abs (r) ^ p)
  g <- p * (abs (r) ^ (p - 1)) * sign (r)
  return (list (f = f, g = g))
}

maxdet <- function (r) {
  f <- -log (det (r))
  g <- -solve (r)
  return (list (f = f, g = g))
}

maxsmc <- function (r, p) {
  beta <- solve (r[-p, -p], r[p, -p])
  f <- sum (beta * r[p, -p])
  h <- rep (1, nrow (r))
  h[-p] <- -beta
  g <- -outer (h, h)
  return (list (f = f, g = g))
}

maxsum <- function (r, p) {
  f <- sum (sqrt (r ^ 2 + p))
  g <- r / sqrt (r ^ 2 + p)
  return (list (f = f, g = g))
}

maximage <- function (r) {
  n <- nrow (r)
  f <- 0
  g <- matrix (0, n, n)
  for (p in 1:n) {
    beta <- solve (r[-p, -p], r[p, -p])
    f <- f + sum (beta * r[p, -p])
    h <- rep (1, nrow (r))
    h[-p] <- -beta
    g <- g - outer (h, h)
  }
  return (list (f = f, g = g))
}
```

### Example 1: Gibbs-Neumann Data

```{r example, echo = FALSE}
library (nnls)
```

Our first example is the Neumann data used by Willard Gibbs in his study of mixtures of gases with convertible components. These data have also been analyzed in Gifi [-@gifi_B_90, p. 370-375], where there is more information about them. We choose the three knots at the quartiles.

```{r neumann}
data (neumann, package = "homals")
neumann_knots <- lapply (neumann, function (x) fivenum (x)[2:4])
neumann_degrees <- c(0, 2, 2)
neumann_ordinal <- c(FALSE, TRUE, TRUE)
```

First we maximize the multiple correlation of the last variable (density) with the first two (temperature and pressure). Temperature is treated as nominal and categorical (spline degree zero); pressure and density are ordinal with spline degree 2.
```{r neumann_regress}
hreg <- aspect (neumann, neumann_knots, neumann_degrees, neumann_ordinal, maxsmc, p = 3)
```

This gives the correlation matrix

```{r regcor, echo = FALSE}
hreg$r
```

and the optimal transformations (with vertical lines at the knots)

```{r regplots, fig.align = "center", echo = FALSE}
plot (neumann[, 1], hreg$tdata[, 1], col = "RED", xlab = "temperature", ylab = "transform")
abline (v = neumann_knots[[1]][1])
abline (v = neumann_knots[[1]][2])
abline (v = neumann_knots[[1]][3])
plot (neumann[, 2], hreg$tdata[, 2], col = "RED", xlab = "pressure", ylab = "transform")
abline (v = neumann_knots[[2]][1])
abline (v = neumann_knots[[2]][2])
abline (v = neumann_knots[[2]][3])
plot (neumann[, 3], hreg$tdata[, 3], col = "RED", xlab = "density", ylab = "transform")
abline (v = neumann_knots[[3]][1])
abline (v = neumann_knots[[3]][2])
abline (v = neumann_knots[[3]][3])
```

Next we compute a one-dimensional PCA by maximizing the largest eigenvalue (using the same knots and degrees).

```{r neumann_pca}
hpca <- aspect (neumann, neumann_knots, neumann_degrees, neumann_ordinal, maxeig, p = 1)
```

The correlation matrix now is

```{r pcacor, echo = FALSE}
hpca$r
```

and the optimal transformations are

```{r pcaplots, fig.align = "center", echo = FALSE}
plot (neumann[, 1], hpca$tdata[, 1], col = "RED", xlab = "temperature", ylab = "transform")
abline (v = neumann_knots[[1]][1])
abline (v = neumann_knots[[1]][2])
abline (v = neumann_knots[[1]][3])
plot (neumann[, 2], hpca$tdata[, 2], col = "RED", xlab = "pressure", ylab = "transform")
abline (v = neumann_knots[[2]][1])
abline (v = neumann_knots[[2]][2])
abline (v = neumann_knots[[2]][3])
plot (neumann[, 3], hpca$tdata[, 3], col = "RED", xlab = "density", ylab = "transform")
abline (v = neumann_knots[[3]][1])
abline (v = neumann_knots[[3]][2])
abline (v = neumann_knots[[3]][3])
```

Next, we maximize what is perhaps the simplest aspect: the sum of the correlations.

```{r neumann_sum}
hsum <- aspect (neumann, neumann_knots, neumann_degrees, neumann_ordinal, maxcor, p = 1)
```

This gives the correlation matrix

```{r corcor}
hsum$r
```

Finally, we maximize the sum of the absolute values of the correlations.
```{r neumann_abs}
habs <- aspect (neumann, neumann_knots, neumann_degrees, neumann_ordinal, maxabs, p = 1)
```

In this case the correlation matrix after the last iteration is

```{r abscor, echo = FALSE}
habs$r
```

and the optimal transformations are

```{r absplots, fig.align = "center", echo = FALSE}
plot (neumann[, 1], habs$tdata[, 1], col = "RED", xlab = "temperature", ylab = "transform")
abline (v = neumann_knots[[1]][1])
abline (v = neumann_knots[[1]][2])
abline (v = neumann_knots[[1]][3])
plot (neumann[, 2], habs$tdata[, 2], col = "RED", xlab = "pressure", ylab = "transform")
abline (v = neumann_knots[[2]][1])
abline (v = neumann_knots[[2]][2])
abline (v = neumann_knots[[2]][3])
plot (neumann[, 3], habs$tdata[, 3], col = "RED", xlab = "density", ylab = "transform")
abline (v = neumann_knots[[3]][1])
abline (v = neumann_knots[[3]][2])
abline (v = neumann_knots[[3]][3])
```

### Example 2: Bodyfat Data

Age, weight, height, and 10 body circumference measurements are recorded for 252 men. Each man's percentage of body fat was accurately estimated by an underwater weighing technique.

```{r bodyfat}
data (fat, package = "faraway")
bodyfat <- fat
bodyfat_knots <- lapply (bodyfat, function (x) fivenum (x)[2:4])
bodyfat_degrees <- rep (2, 18)
bodyfat_ordinal <- rep (TRUE, 18)
bodyfat_ordinal[4] <- FALSE
names (bodyfat)
```

The first two variables are bodyfat indices, computed with different formulas. We see how well we can predict the first one using the multiple correlation aspect. All variables are transformed monotonically, except age.

```{r bodysmc}
bsmc <- aspect (bodyfat, bodyfat_knots, bodyfat_degrees, bodyfat_ordinal, maxsmc, p = 1)
```

We plot the optimal transformation for the brozek index and for age.
```{r bodysmcplot, fig.align = "center", echo = FALSE}
plot (bodyfat[, 1], bsmc$tdata[, 1], col = "RED", xlab = "brozek", ylab = "transform")
nknots <- length (bodyfat_knots[[1]])
for (i in 1:nknots) abline (v = bodyfat_knots[[1]][i])
plot (bodyfat[, 4], bsmc$tdata[, 4], col = "RED", xlab = "age", ylab = "transform")
nknots <- length (bodyfat_knots[[4]])
for (i in 1:nknots) abline (v = bodyfat_knots[[4]][i])
```

We can use nonlinear principal component analysis to look more closely at the structure of the variables defining the indices. We optimize the sum of the two largest eigenvalues of the correlation matrix of the last 16 variables.

```{r beig}
beig <- aspect (bodyfat[, 3:18], bodyfat_knots[3:18], bodyfat_degrees[3:18],
                bodyfat_ordinal[3:18], maxeig, p = 2)
```

The eigenvalues of the correlation matrix, as proportions of the total, with a plot of their cumulative values, are as follows.

```{r beigeig, fig.align = "center", echo = FALSE}
e <- eigen (beig$r)
v <- e$values
v / 16
plot (1:16, cumsum (v) / 16, col = "RED", type = "l", lwd = 3,
      xlab = "eigenvalues", ylab = "cumulative proportion")
for (i in 1:16) abline (v = i)
```

The scatterplot of the first two principal components is as follows.

```{r scat, fig.align = "center", echo = FALSE}
s <- e$vectors[, 1:2] %*% diag (sqrt (v[1:2]))
plot (s, type = "n", xlab = "dimension 1", ylab = "dimension 2")
text (s, names (fat)[3:18], col = "RED")
abline (v = 0)
abline (h = 0)
```

### Example 3: Air Pollution

The Sokal/Rohlf air pollution data have previously been used by @wang_murphy_04 to illustrate ACE [@breiman_friedman_85]. The outcome variable is the sulphur dioxide concentration in the air, in micrograms per cubic metre, in 41 cities in the USA.
The predictors are

* average annual temperature in degrees F
* number of manufacturers employing >20 workers
* population size in thousands
* average annual wind speed in miles per hour
* average annual rainfall in inches
* average number of days of rainfall per year

```{r airload}
data (usair, package = "gamlss.data")
usair_knots <- lapply (usair, function (x) fivenum (x)[2:4])
usair_degrees <- c(3, 3, 3, 3, 3, 3, 3)
usair_ordinal <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
```

```{r airsmc}
airsmc <- aspect (usair, usair_knots, usair_degrees, usair_ordinal, verbose = 0,
                  itmax = 1000, maxsmc, p = 1)
```

The optimal SMC is `r airsmc$f` (for ACE this is 0.9469), the correlation matrix is

```{r usaircor, echo = FALSE}
airsmc$r
```

and the regression coefficients are

```{r usairbeta, echo = FALSE}
r <- airsmc$r
solve (r[2:7, 2:7], r[1, 2:7])
```

The seven transformation plots are as follows.

```{r usairplot, fig.align = "center", echo = FALSE}
lbs <- c("sulphur", "temperature", "enterprises", "population size",
         "wind speed", "annual precipitation", "precipitation days")
plot (usair[, 1], airsmc$tdata[, 1], col = "RED", xlab = lbs[1], ylab = "transform")
par (mfrow = c(3, 2))
for (i in 2:7) {
  plot (usair[, i], airsmc$tdata[, i], col = "RED", xlab = lbs[i], ylab = "transform")
  nknots <- length (usair_knots[[i]])
  for (k in 1:nknots) abline (v = usair_knots[[i]][k])
}
par (mfrow = c(1, 1))
```

We also use this example to show that an empty vector of interior knots leads to a polynomial transformation. In this case we transform the dependent variable linearly (which basically means we do not transform it at all).

```{r airlin}
usair_knots[[1]] <- numeric (0)
usair_degrees[1] <- 1
airlin <- aspect (usair, usair_knots, usair_degrees, usair_ordinal, verbose = 0, maxsmc, p = 1)
```

The optimal SMC is now `r airlin$f`, the correlation matrix is

```{r usairlin, echo = FALSE}
airlin$r
```

and the regression coefficients are

```{r uslinbeta, echo = FALSE}
r <- airlin$r
solve (r[2:7, 2:7], r[1, 2:7])
```

The seven transformation plots are as follows.
```{r uslinplot, fig.align = "center", echo = FALSE}
lbs <- c("sulphur", "temperature", "enterprises", "population size",
         "wind speed", "annual precipitation", "precipitation days")
plot (usair[, 1], airlin$tdata[, 1], col = "RED", xlab = lbs[1], ylab = "transform")
par (mfrow = c(3, 2))
for (i in 2:7) {
  plot (usair[, i], airlin$tdata[, i], col = "RED", xlab = lbs[i], ylab = "transform")
  nknots <- length (usair_knots[[i]])
  for (k in 1:nknots) abline (v = usair_knots[[i]][k])
}
par (mfrow = c(1, 1))
```

### Example 4: Moral Integration of American Cities

The angell data frame has 43 rows and 4 columns. The observations are 43 U.S. cities around 1950. These data were previously analyzed with ACE by @devaux_89. The variables are

* Moral Integration: composite of crime rate and welfare expenditures.
* Ethnic Heterogeneity: from percentages of nonwhite and foreign-born white residents.
* Geographic Mobility: from percentages of residents moving into and out of the city.
* A factor with levels: Northeast, Midwest, Southeast, West.

```{r angell}
data (Angell, package = "car")
angell <- Angell[, 1:3]
angell_knots <- lapply (angell, function (x) fivenum (x)[2:4])
angell_degrees <- c(2, 2, 2)
angell_ordinal <- c(TRUE, TRUE, TRUE)
```

```{r angsmc}
hsmc <- aspect (angell, angell_knots, angell_degrees, angell_ordinal, maxsmc, p = 1)
```

Correlations between transformed variables are as follows.

```{r angcor, echo = FALSE}
hsmc$r
```

The transformation plots are as follows.
```{r angplot, fig.align = "center", echo = FALSE}
lbs <- c("moral", "hetero", "mobility", "region")
plot (angell[, 1], hsmc$tdata[, 1], col = "RED", xlab = lbs[1], ylab = "transform")
par (mfrow = c(1, 2))
for (i in 2:3) {
  plot (angell[, i], hsmc$tdata[, i], col = "RED", xlab = lbs[i], ylab = "transform")
  nknots <- length (angell_knots[[i]])
  for (k in 1:nknots) abline (v = angell_knots[[i]][k])
}
par (mfrow = c(1, 1))
```

### Example 5: Thirteen Personality Scales

We illustrate a more complicated aspect using a small data set from the psych package [@revelle_15] of five scales from the Eysenck Personality Inventory, five from a Big 5 inventory, a Beck Depression Inventory, and State and Trait Anxiety measures.

```{r epidata}
data (epi.bfi, package = "psych")
epi <- epi.bfi
epi_knots <- lapply (epi, function (x) fivenum (x)[2:4])
epi_degrees <- rep (2, 13)
epi_ordinal <- rep (TRUE, 13)
```

The aspect we use is the maximized log-likelihood of a factor analysis model. Because this aspect is the maximum of a family of functions linear in $R$, it is indeed convex. Obviously the same result applies to any structural equation model combined with a multinormal log-likelihood. Using the negative logarithm of the determinant as an aspect corresponds to the special case of a saturated model.

```{r maxfac}
maxfac <- function (r, p) {
  fa <- factanal (NULL, p, covmat = r, rotation = "none")
  s <- tcrossprod (fa$loadings) + diag (fa$uniquenesses)
  g <- -solve (s)
  f <- sum (g * r) - log (det (s))
  return (list (f = f, g = g))
}
```

We fit a two-factor model, without worrying whether two factors are appropriate for these data.

```{r epifac}
efac <- aspect (epi, epi_knots, epi_degrees, epi_ordinal, maxfac, p = 2)
```

The transformation plots are as follows.
```{r epiplot, fig.align = "center", echo = FALSE}
lbs <- c("epiE", "epiS", "epiImp", "epiLie", "epiNeur", "bfagree", "bfcon",
         "bfext", "bfneur", "bfopen", "bdi", "traitanx", "statanx")
par (mfrow = c(1, 5))
for (i in 1:13) {
  plot (epi[, i], efac$tdata[, i], col = "RED", xlab = lbs[i], ylab = "transform")
  nknots <- length (epi_knots[[i]])
  for (k in 1:nknots) abline (v = epi_knots[[i]][k])
}
par (mfrow = c(1, 1))
```

Here are the factor loadings, for those so inclined.

```{r corcircle, fig.align = "center", echo = FALSE}
x <- seq (0, 2 * pi, length = 1000)
s <- cbind (sin (x), cos (x))
plot (s, type = "n", xlim = c(-1, 1), ylim = c(-1, 1), xlab = "factor 1",
      ylab = "factor 2", asp = 1, xaxs = "i")
lines (s, col = "RED", lwd = 2)
fa <- factanal (NULL, 2, covmat = efac$r, rotation = "none")
lx <- fa$loadings
text (lx, lbs, cex = .75)
for (i in 1:13) arrows (0, 0, lx[i, 1], lx[i, 2], col = "BLUE", length = .1)
```

And here are the uniquenesses. Variable 1 (epiE) seems to be heading for a Heywood case.

```{r funique, echo = FALSE}
fa$uniquenesses
```

### References