The basic structure for all solutions is the same, we see the familiar *horseshoe*, in which the left-right scale is bended so that extreme left (CPN) and extreme right (BP) are close. This is perhaps most clear for r = .75. If r becomes large, then we see increasing clustering. We also give all six solutions plotted on top of each other (after Procrustus matching) in figure 9. It shows that the two extreme parties are stable over solutions, but the middle is more variable.

Plotting rStress in figure 3 as a function of r shows, somewhat surprisingly perhaps, an increasing function. Because the data are average dissimilarities, it is likely there is a fairly large additive constant, and additive constants correspond with lower values of r.

We put the nuber of iterations and the rStress values in a small table. ``` ## r: 0.10 iterations: 29103 rStress: 0.005464 ## r: 0.25 iterations: 3605 rStress: 0.006310 ## r: 0.50 iterations: 3566 rStress: 0.044603 ## r: 0.75 iterations: 3440 rStress: 0.107113 ## r: 1.00 iterations: 100000 rStress: 0.155392 ## r: 2.00 iterations: 100000 rStress: 0.234877 ``` Figure 4 shows the six Shepard diagrams. We see the clustering for r = 2. For r = .1 the Shepard diagram becomes concave, indicating that the larger dissimilarities are underestimated and reflecting the fact that for small powers the powered distances will all be close to one.

##Ekman Color Data We use the color data from @ekman_54, with two dimensions and unit weights. As we know from previous analyses, MDS of the Ekman data gives a very good fit and a very clear representation of the color circle. ``` ## 434 445 465 472 490 504 537 555 584 600 610 628 651 ## 445 0.14 ## 465 0.58 0.50 ## 472 0.58 0.56 0.19 ## 490 0.82 0.78 0.53 0.46 ## 504 0.94 0.91 0.83 0.75 0.39 ## 537 0.93 0.93 0.90 0.90 0.69 0.38 ## 555 0.96 0.93 0.92 0.91 0.74 0.55 0.27 ## 584 0.98 0.98 0.98 0.98 0.93 0.86 0.78 0.67 ## 600 0.93 0.96 0.99 0.99 0.98 0.92 0.86 0.81 0.42 ## 610 0.91 0.93 0.98 1.00 0.98 0.98 0.95 0.96 0.63 0.26 ## 628 0.88 0.89 0.99 0.99 0.99 0.98 0.98 0.97 0.73 0.50 0.24 ## 651 0.87 0.87 0.95 0.98 0.98 0.98 0.98 0.98 0.80 0.59 0.38 0.15 ## 674 0.84 0.86 0.97 0.96 1.00 0.99 1.00 0.98 0.77 0.72 0.45 0.32 0.24 ``` In our algorithm we always start with a classical metric scaling solution [@torgerson_58]. We minimize rStress for r equal to 0.1, 0.25, 0.5, 0.75, 1, 2. The six solutions are plotted jointly in Figure 5. Because of the very good fit, solutions for different values of r are similar and very much on the color circle.

The Shepard diagrams in figure 6 are interesting, because, unlike the configurations, they are quite different, becoming more and more convex as r increases. The plot for r = .25 looks best.

We plot rStress as a function of r in Figure 7. The best fit is attained for r = .25, which means fitting square roots of Euclidean distances to dissimilarities. Nonmetric MDS of the Ekman data has already indicated that the optimal nonmetric fit is attained with a concave increasing transformation.

``` ## r: 0.10 iterations: 100000 rStress: 0.017839 ## r: 0.25 iterations: 1361 rStress: 0.001910 ## r: 0.50 iterations: 535 rStress: 0.017213 ## r: 0.75 iterations: 3343 rStress: 0.054769 ## r: 1.00 iterations: 13749 rStress: 0.093063 ## r: 2.00 iterations: 100000 rStress: 0.181719 ``` Especially when $r$ is far from $\frac12$ it will make a difference if we minimize $\sigma_r$ or the $\tilde\sigma_r$ of $\eqref{E:tilde}$. We illustrate this by choosing $r=.01$, which will presumably take us close to the logarithm.

After 14837 iterations we find a $\tilde\sigma_r$ value of 0.000012. We see a rather clear deviation from the circular pattern we found with rStress. #Code ```r torgerson <- function(delta, p = 2) { doubleCenter <- function(x) { n <- dim(x)[1] m <- dim(x)[2] s <- sum(x) / (n * m) xr <- rowSums(x) / m xc <- colSums(x) / n return((x - outer(xr, xc, "+")) + s) } z <- eigen(-doubleCenter((as.matrix (delta) ^ 2) / 2), symmetric = TRUE) v <- pmax(z$values, 0) return(z$vectors[, 1:p] %*% diag(sqrt(v[1:p]))) } enorm <- function (x, w = 1) { return (sqrt (sum (w * (x ^ 2)))) } sqdist <- function (x) { s <- tcrossprod (x) v <- diag (s) return (outer (v, v, "+") - 2 * s) } mkBmat <- function (x) { d <- rowSums (x) x <- -x diag (x) <- d return (x) } mkPower <- function (x, r) { n <- nrow (x) return (abs ((x + diag (n)) ^ r) - diag(n)) } matchConf <- function (x, eps = 1e-6, itmax = 100, verbose = TRUE) { m <- length (x) n <- nrow (x[[1]]) p <- ncol (x[[1]]) itel <- 1 dold <- Inf repeat { y <- matrix (0, n, p) for (k in 1:m) y <- y + x[[k]] y <- y / m dnew <- 0 for (k in 1:m) { s <- svd (crossprod(x[[k]], y)) x[[k]] <- tcrossprod (x[[k]] %*% (s$u), s$v) dnew <- dnew + sum ((y - x[[k]]) ^ 2) } if (verbose) { cat ( formatC (itel, width = 4, format = "d"), formatC ( dold, digits = 10, width = 13, format = "f" ), formatC ( dnew, digits = 10, width = 13, format = "f" ), "\n" ) } if ((itel == itmax) || ((dold - dnew) < eps)) break () itel <- itel + 1 dold <- dnew } return (x) } rStressMin <- function (delta, w = 1 - diag (nrow (delta)), p = 2, r = 0.5, eps = 1e-10, itmax = 100000, verbose = TRUE) { delta <- delta / enorm (delta, w) itel <- 1 xold <- torgerson (delta, p = p) xold <- xold / enorm (xold) n <- nrow (xold) nn <- diag (n) dold <- sqdist (xold) rold <- sum (w * delta * mkPower (dold, r)) nold <- sum (w * mkPower (dold, 2 * r)) aold <- rold / nold sold <- 1 - 2 * aold * rold + (aold ^ 2) * nold repeat { p1 <- mkPower (dold, r - 1) p2 <- mkPower (dold, (2 * r) - 1) by <- mkBmat (w * delta * p1) cy <- mkBmat (w * p2) ga <- 2 * sum (w * p2) be <- (2 * r - 1) * (2 ^ r) * sum (w * delta) de <- (4 * r - 1) * (4 ^ r) * sum (w) if (r >= 0.5) { my <- by - aold * (cy - de * nn) } if (r < 0.5) { my <- (by - be * nn) - aold * (cy - ga * nn) } xnew <- my %*% xold xnew <- xnew / enorm (xnew) dnew <- sqdist (xnew) rnew <- sum (w * delta * mkPower (dnew, r)) nnew <- sum (w * mkPower (dnew, 2 * r)) anew <- rnew / nnew snew <- 1 - 2 * anew * rnew + (anew ^ 2) * nnew if (verbose) { cat ( formatC (itel, width = 4, format = "d"), formatC ( sold, digits = 10, width = 13, format = "f" ), formatC ( snew, digits = 10, width = 13, format = "f" ), "\n" ) } if ((itel == itmax) || ((sold - snew) < eps)) break () itel <- itel + 1 xold <- xnew dold <- dnew sold <- snew aold <- anew } return (list (x = xnew, alpha = anew, sigma = snew, itel = itel)) } ``` #NEWS 001 01/12/16 -- First published version, without code and examples 002 01/12/16 -- Added artificial example and code. 003 01/13/16 -- Squashed two nasties 004 01/13/16 -- Added Ekman Example 005 01/13/16 -- Many small edits 006 01/13/16 -- Reorganized proofs 007 01/13/16 -- Added political parties example 008 01/13/16 -- Added configuration matching 009 01/13/16 -- Added Ekman power solution 010 01/14/16 -- Text in examples -- Additional plots 011 01/14/16 -- small change in theorems and code #References