```
##      [,1] [,2]
## [1,] 0.93 0.16
## [2,] 0.16 0.03
##      [,1] [,2]
## [1,] 0.96 0.04
## [2,] 0.04 0.64
##      [,1] [,2]
## [1,] 0.94 0.07
## [2,] 0.07 0.68
##       [,1]  [,2]
## [1,]  0.40 -0.13
## [2,] -0.13  0.04
##       [,1]  [,2]
## [1,]  0.29 -0.18
## [2,] -0.18  0.81
##      [,1] [,2]
## [1,] 0.07 0.04
## [2,] 0.04 0.03
```

```
##      [,1] [,2]
## [1,] 0.60 0.00
## [2,] 0.00 0.37
```

```
## [1] 0.5157326
```

```
## [1] 1.031465
```

```
##        [,1] [,2]  [,3]  [,4]  [,5]  [,6]  [,7] [,8]  [,9] [,10] [,11]
##  [1,]  1.00 0.00 -0.66 -0.75  0.82 -0.58  0.46 0.00 -0.41  0.03 -0.22
##  [2,]  0.00 0.00  0.00  0.00  0.00  0.00  0.00 0.00  0.00  0.00  0.00
##  [3,] -0.66 0.00  1.00 -0.00 -0.71  0.14 -0.44 0.00  0.56  0.43  0.17
##  [4,] -0.75 0.00 -0.00  1.00 -0.45  0.65 -0.23 0.00  0.06 -0.41  0.14
##  [5,]  0.82 0.00 -0.71 -0.45  1.00 -0.00  0.51 0.00 -0.41 -0.37 -0.15
##  [6,] -0.58 0.00  0.14  0.65 -0.00  1.00 -0.07 0.00  0.13 -0.56  0.17
##  [7,]  0.46 0.00 -0.44 -0.23  0.51 -0.07  1.00 0.00 -0.28 -0.17 -0.29
##  [8,]  0.00 0.00  0.00  0.00  0.00  0.00  0.00 0.00  0.00  0.00  0.00
##  [9,] -0.41 0.00  0.56  0.06 -0.41  0.13 -0.28 0.00  1.00  0.00 -0.24
## [10,]  0.03 0.00  0.43 -0.41 -0.37 -0.56 -0.17 0.00  0.00  1.00 -0.09
## [11,] -0.22 0.00  0.17  0.14 -0.15  0.17 -0.29 0.00 -0.24 -0.09  1.00
## [12,]  0.00 0.00  0.00  0.00  0.00  0.00  0.00 0.00  0.00  0.00  0.00
##       [,12]
##  [1,]  0.00
##  [2,]  0.00
##  [3,]  0.00
##  [4,]  0.00
##  [5,]  0.00
##  [6,]  0.00
##  [7,]  0.00
##  [8,]  0.00
##  [9,]  0.00
## [10,]  0.00
## [11,]  0.00
## [12,]  0.00
```

```
##  [1]  0.60  0.37  0.21  0.13  0.10  0.07  0.02  0.00  0.00  0.00 -0.00
## [12] -0.00
```

### GALO

```r
h <- homals(galo, degrees = rep(-1, 4))
```

The four star plots are in figure 4.

### Thirteen Personality Scales

Our next example is a small data set from the `psych` package [@revelle_15] of five scales from the Eysenck Personality Inventory, five from a Big Five inventory, a Beck Depression Inventory, and State and Trait Anxiety measures.

```r
data(epi.bfi, package = "psych")
epi <- epi.bfi
epi_knots <- lapply(epi, function(x) fivenum(x)[2:4])
epi_degrees <- rep(0, 13)
epi_ordinal <- rep(FALSE, 13)
```

We perform a two-dimensional MCA, using degree zero and inner knots at the three quartiles for all 13 variables.

```r
h <- homals(epi, epi_knots, epi_degrees, epi_ordinal)
```

We have convergence in 260 iterations to loss 0.7478043. The object scores are in figure 3.

Figure 4 has the $G_jY_j$ for each of the thirteen variables, with the first dimension in red, and the second dimension in blue. Because the degree of the splines is zero, these _transformation plots_ show step functions, with the steps at the knots, which are represented by vertical lines.
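A degree-zero spline transformation is simply an indicator coding of the knot intervals, so $G_jY_j$ is constant between knots. The following base R sketch illustrates this; the data values, knots, and quantifications are made up for illustration and are not taken from the example above.

```r
# degree-0 spline basis: one indicator column per knot interval
x <- c(0.2, 1.4, 2.7, 3.1, 4.8, 5.9)
knots <- c(2.5, 4.0)                       # interior knots
interval <- findInterval(x, knots) + 1     # interval index: 1 .. length(knots) + 1
G <- diag(length(knots) + 1)[interval, ]   # indicator matrix (one 1 per row)
y <- c(-1, 0, 1)                           # one quantification per interval
step_values <- G %*% y                     # a step function of x, constant between knots
```

Each row of `G` selects the interval containing the corresponding `x`, so `G %*% y` jumps only at the knots, exactly the step functions seen in the transformation plots.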

The thirteen star plots are in figure 4.

Now change the degree to two for all variables, i.e. fit piecewise quadratic polynomials which are differentiable at the knots. We still have two copies for each variable, and these two copies define the sets.

```r
epi_degrees <- rep(2, 13)
h <- homals(epi, epi_knots, epi_degrees, epi_ordinal)
```

We have convergence in 785 iterations to loss 0.7179135. The object scores are in figure 6 and the transformation plots in figure 7.
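A piecewise quadratic basis of this kind can be inspected with the standard `splines` package; this is only an illustration of the basis, not the internal spline code of the program, and the knots below are invented.

```r
library(splines)  # shipped with base R

x <- seq(0, 10, length.out = 201)
B <- bs(x, knots = c(2.5, 5.0, 7.5), degree = 2)  # piecewise quadratic B-spline basis
# number of basis functions = number of interior knots + degree
ncol(B)
```

Linear combinations of these basis functions are quadratic between knots and continuously differentiable at them, which is why the transformation plots become smooth curves instead of step functions.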

# Correspondence Analysis and corals()

# Nonlinear Principal Component Analysis and princals()

Suppose each of the $m$ sets contains only a single variable. Then the Burt matrix is the correlation matrix of the $H_j$, which are all $n\times 1$ matrices in this case. It follows that GMVA maximizes the sum of the $r$ largest eigenvalues of the correlation matrix over transformations, i.e. GMVA is _nonlinear principal component analysis_ [@deleeuw_C_14].

## Example: Thirteen Personality Scales

We use the same data as before for an NLPCA with all sets of rank one, all variables ordinal, and splines of degree 2.

```r
library(nnls)
epi_copies <- rep(1, 13)
epi_ordinal <- rep(TRUE, 13)
h <- princals(epi, epi_knots, epi_degrees, epi_ordinal, epi_copies)
```

In 19 iterations we find minimum loss 0.7330982. The object scores are in figure 8 and the transformation plots in figure 9. NLPCA maximizes the sum of the two largest eigenvalues of the correlation matrix of the variables. Before transformation the eigenvalues are 4.0043587, 2.6702003, 1.9970912, 0.8813983, 0.6571463, 0.6299946, 0.5246896, 0.4657022, 0.3457515, 0.3403361, 0.2767531, 0.1835449, 0.0230333, after transformation they are 4.1939722, 2.7454868, 1.604906, 0.8209072, 0.7184825, 0.677183, 0.51865, 0.4545214, 0.4200148, 0.351787, 0.2928574, 0.1699557, 0.0312759. The sum of the first two goes from 6.674559 to 6.9394591.
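The NLPCA criterion, the sum of the $r$ largest eigenvalues of the correlation matrix, is easy to compute directly in base R. The sketch below uses random data purely for illustration.

```r
set.seed(1)
X <- matrix(rnorm(100 * 5), 100, 5)       # illustrative data, 5 "variables"
R <- cor(X)
ev <- eigen(R, symmetric = TRUE)$values   # eigenvalues in decreasing order
sum(ev[1:2])                              # the quantity NLPCA maximizes for r = 2
```

NLPCA searches over admissible transformations of the variables to make this sum as large as possible; for untransformed variables it is just the classical PCA fit in two dimensions.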

We repeat the analysis with ordinal variables of degree two, without interior knots. Thus the transformation plots will be quadratic polynomials that are monotone over the range of the data.

```r
h <- princals(epi, knotsE(epi), epi_degrees, epi_ordinal)
```

In 20 iterations we find minimum loss 0.7393666. The object scores are in figure 10 and the transformation plots in figure 11. The eigenvalues are now 4.0828642, 2.6936186, 1.8391342, 0.8732231, 0.6666505, 0.6491709, 0.5390077, 0.459182, 0.3632868, 0.3471175, 0.2845394, 0.1782232, 0.023982, with sum of the first two equal to 6.7764828.

# Optimal Scaling and primals()

# Canonical Analysis and canals()

If there are only two sets the generalized eigenvalue problem for the Burt matrix becomes
$$
\begin{bmatrix} D_1&C_{12}\\C_{21}&D_2 \end{bmatrix}
\begin{bmatrix} a_1\\a_2 \end{bmatrix}=2\lambda\begin{bmatrix}D_1&0\\0&D_2\end{bmatrix}\begin{bmatrix} a_1\\a_2 \end{bmatrix},
$$
which we can rewrite as
$$
\begin{split}
C_{12}a_2&=(2\lambda-1)D_1a_1,\\
C_{21}a_1&=(2\lambda-1)D_2a_2,
\end{split}
$$
from which we see that GMVA maximizes the sum of the $r$ largest canonical correlations between $H_1$ and $H_2$. See also @vandervelden_12.

# Multiple Regression and morals()

If the second set contains only a single copy of a single variable, then we choose transformations that maximize the multiple correlation of that variable with the variables in the first set.

## Example: Polynomial Regression

```r
x <- center(as.matrix(seq(0, pi, length = 20)))
y <- center(as.matrix(sin(x)))
h <- morals(x, y, xknots = knotsE(x), xdegrees = 3, xordinal = TRUE)
plot(y, h$yhat)
```

![](_main_files/figure-html/polynom_data-1.png)

```r
plot(x, h$xhat)
```

![](_main_files/figure-html/polynom_data-2.png)

```r
plot(x, y)
lines(x, h$ypred)
```

![](_main_files/figure-html/polynom_data-3.png)

## Example: Gases with Convertible Components

We analyze a regression example, using data from Neumann, previously used by Willard Gibbs, and analyzed with regression in a still quite readable article by @wilson_26. Wilson's analysis was discussed and modified using splines in @gifi_B_90 (pages 370-376). In the regression analysis in this section we use two copies of temperature, with spline degree zero, and the first copy ordinal. For pressure and the dependent variable density we use a single ordinal copy with spline degree two.
```r
data(neumann, package = "homals")
xneumann <- neumann[, 1:2]
yneumann <- neumann[, 3, drop = FALSE]
xdegrees <- c(0, 2)
```

```r
h <- morals(xneumann, yneumann, xdegrees = c(0, 2), xcopies = c(2, 1))
```

In 34 iterations we find minimum loss 0.0295045, corresponding with a multiple correlation of 0.8854642. The object scores, plotted against the original variables (not the transformed variables), are in figure 12, and the transformation plots are in figure 13.
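The claim in the canonical-analysis section, that the two-set generalized eigenproblem yields the canonical correlations, can be checked numerically in base R. The data below are random and purely illustrative; `cancor()` is the standard canonical correlation routine from the `stats` package.

```r
set.seed(7)
H1 <- scale(matrix(rnorm(100 * 3), 100, 3), scale = FALSE)  # centered set 1
H2 <- scale(matrix(rnorm(100 * 2), 100, 2), scale = FALSE)  # centered set 2
D1 <- crossprod(H1); D2 <- crossprod(H2); C12 <- crossprod(H1, H2)
burt <- rbind(cbind(D1, C12), cbind(t(C12), D2))            # Burt matrix
D <- rbind(cbind(D1, matrix(0, 3, 2)), cbind(matrix(0, 2, 3), D2))
lambda <- eigen(solve(D, burt), only.values = TRUE)$values
rho <- sort(Re(2 * lambda - 1), decreasing = TRUE)[1:2]     # largest 2*lambda - 1
rho - cancor(H1, H2)$cor                                    # essentially zero
```

The eigenvalues of the generalized problem come in pairs $2\lambda - 1 = \pm\rho_i$ plus zeros, so the largest values of $2\lambda-1$ are exactly the canonical correlations.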

# Discriminant Analysis and criminals()

If the second set contains more than one copy of a single variable and we use binary indicator coding for that variable, then we optimize the eigenvalue (between/within ratio) sums for a canonical discriminant analysis.

## Example: Iris data

The next example illustrates (canonical) discriminant analysis, using the obligatory Anderson-Fisher iris data. Since there are three species of iris, we use two copies of the species variable. The other four variables are in the same set; they are transformed using piecewise linear monotone splines with five knots.

```r
data(iris, package = "datasets")
iris_vars <- names(iris)
iris[[5]] <- as.numeric(iris[[5]])
iris_knots <- as.list(1:5)
for (i in 1:4) iris_knots[[i]] <- quantile(iris[[i]], (1:5) / 6)
iris_knots[[5]] <- 1:3
iris_degrees <- c(1, 1, 1, 1, 0)
iris_ordinal <- c(TRUE, TRUE, TRUE, TRUE, FALSE)
iris_copies <- c(1, 1, 1, 1, 2)
iris_sets <- c(1, 1, 1, 1, 2)
```

```r
h <- criminals(iris, iris_knots, iris_degrees, iris_ordinal, iris_sets, iris_copies)
```

In 34 iterations we find minimum loss 0.0295045. The object scores, plotted against the original variables (not the transformed variables), are in figure 14, and the transformation plots are in figure 15.
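Writing $T$ for the total dispersion of the variables and $B$ for the between-groups dispersion, the between/total eigenvalues for the untransformed iris measurements can be computed directly in base R. This is an independent sketch, not the internal computation of `criminals()`.

```r
data(iris, package = "datasets")
X <- scale(as.matrix(iris[, 1:4]), scale = FALSE)   # center the four measurements
n <- nrow(X)
Tm <- crossprod(X) / n                              # total dispersion T
counts <- table(iris$Species)
M <- rowsum(X, iris$Species) / as.vector(counts)    # group means (centered data)
B <- crossprod(M * sqrt(as.vector(counts) / n))     # between-groups dispersion B
ev <- eigen(solve(Tm, B), only.values = TRUE)$values
Re(ev)   # at most two nonzero values, each between 0 and 1
```

With three species, $B$ has rank two, so there are exactly two nonzero eigenvalues of $T^{-1}B$, and their sum is what the discriminant analysis optimizes in two dimensions.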

Discriminant analysis decomposes the total dispersion matrix $T$ into the sum of a between-groups dispersion $B$ and a within-groups dispersion $W$, and then finds directions in the space spanned by the variables for which the between-groups variance is largest relative to the total variance. GMVA optimizes the sum of the $r$ largest eigenvalues of $T^{-1}B$. Before optimal transformation these eigenvalues for the iris data are `r `, after transformation they are `r `.

# Multiset Canonical Correlation and overals()

## Example: Thirteen Personality Scales

This is the same example as before, but now the five scales from the Eysenck Personality Inventory form one set and the five scales from the Big Five inventory form another. The remaining three variables define three separate sets. No copies are used, and we use monotone cubic splines with the interior knots at the quartiles.

```r
epi_knots <- lapply(epi, function(x) fivenum(x)[2:4])
epi_degrees <- rep(3, 13)
epi_sets <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 4, 5)
```

```r
h <- overals(epi, epi_sets, epi_copies, epi_knots, epi_degrees, epi_ordinal)
```

```
## [1] 231 13
```

In 196 iterations we find minimum loss 0.4724286. The object scores are in figure 16 and the transformation plots in figure 17.

\appendix

# Code

## R Code

### Engine

### Wrappers

### Utilities

### Splines

### Gram-Schmidt

## C Code

### Splines

### Gram-Schmidt

# NEWS

# TO DO

# References