We also compute the solution using loss function $\eqref{E:KC}$ with $\alpha=0$ instead of $\eqref{E:final}$. After 195 iterations we find loss 0.2881099728. The gradient at the solution is

```
## [1] +0.000200 +0.000253 -0.000873 -0.000536 +0.000249 +0.000657 +0.000008
## [8] -0.000246 -0.000397 +0.000029 +0.000128 -0.000136 -0.000189 -0.000239
## [15] -0.000476 -0.000386 -0.000295 -0.000394 -0.000366 +0.000133 +0.000707
## [22] +0.000183 -0.000630 +0.000642 -0.000330 -0.000550 -0.000017 -0.000087
## [29] +0.000067 -0.000412
```

The solution in figure 3 is virtually indistinguishable from the one in figures 1 and 2.

## Dutch Political Parties 1967

As the next illustration we use data from @degruijter_67, with average dissimilarity judgments between Dutch political parties in 1967. The data are

```
##       KVP PvdA  VVD  ARP  CHU  CPN  PSP   BP
## PvdA 5.63
## VVD  5.27 6.72
## ARP  4.60 5.64 5.46
## CHU  4.80 6.22 4.97 3.20
## CPN  7.54 5.12 8.13 7.84 7.80
## PSP  6.73 4.59 7.55 6.73 7.08 4.08
## BP   7.18 7.22 6.90 7.28 6.96 6.34 6.88
## D66  6.17 5.47 4.67 6.13 6.04 7.42 6.36 7.36
```

The process converges in 39 iterations to loss function value 0.0418615488. The gradient at the solution is

```
## [1] -0.001614 +0.000297 -0.001651 -0.000921 -0.000015 +0.000113 -0.000842
## [8] +0.000264 -0.000702 +0.000685 +0.002058 -0.000529 +0.000125 -0.000309
## [15] +0.000951 +0.000694
```
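To reproduce this run, the dissimilarities have to be supplied as a vector holding the strict lower triangle of the matrix in column-major order, which matches the convention of `symmetricFromTriangle` in the appendix. A minimal sketch in base R (reading the table column by column):

```r
# Party labels and the strict lower triangle of the De Gruijter
# dissimilarities, read column by column from the table above.
parties <- c("KVP", "PvdA", "VVD", "ARP", "CHU", "CPN", "PSP", "BP", "D66")
delta <- c(5.63, 5.27, 4.60, 4.80, 7.54, 6.73, 7.18, 6.17,
           6.72, 5.64, 6.22, 5.12, 4.59, 7.22, 5.47,
           5.46, 4.97, 8.13, 7.55, 6.90, 4.67,
           3.20, 7.84, 6.73, 7.28, 6.13,
           7.80, 7.08, 6.96, 6.04,
           4.08, 6.34, 7.42,
           6.88, 6.36,
           7.36)
n <- length(parties)
dmat <- matrix(0, n, n, dimnames = list(parties, parties))
dmat[lower.tri(dmat)] <- delta  # lower.tri() indexes column-major
dmat <- dmat + t(dmat)          # symmetrize; diagonal stays zero
```

The vector `delta` is what the analysis above operates on; `dmat` is only built here as a check that the ordering is right.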

## Ekman Color Data

The final example is the color data from @ekman_54. The process converges in 85 iterations to loss function value 0.000505895068078001. The gradient at the solution is

```
## [1] +0.000097 -0.000009 +0.000064 +0.000004 -0.000056 +0.000086 +0.000002
## [8] +0.000026 +0.000019 -0.000065 -0.000088 -0.000064 -0.000048 +0.000012
## [15] +0.000012 -0.000029 -0.000014 +0.000061 +0.000007 +0.000103 -0.000117
## [22] -0.000058 +0.000061 -0.000003 +0.000055 +0.000013
```

# Discussion

As an aside, the Shepard loss function is similar to the loss function used in the Guttman-Lingoes programs (@guttman_68a, @lingoes_roskam_73),
$$
\sigma_{GL}^\star(x):=\frac12\sum_{k=1}^K(d_k(x)-d_k^\star(x))^2=\sum_{k=1}^K(d_k(x)-d_k^\star(x))d_k(x),
$$
where the $d_k^\star(x)$ are the *rank-images*, i.e. the $d_k(x)$ permuted in such a way that they are monotonic with the dissimilarities. Thus instead of `sort(delta)[rank(dist)]` we use `sort(dist)[rank(delta)]`. No arithmetic is done with the dissimilarities; only their rank order is used. But the rank images do not have any clear optimality properties. As a consequence the resulting loss is considerably less smooth than a loss based on monotone regression or on Shepard's rearrangements. The report @deleeuw_R_73g is of interest in this context, in part because the pdf includes a copy of the report aggressively annotated by Joseph Kruskal, as well as some obfuscating correspondence with Jim Lingoes. Also see Appendix 2 of @kruskal_77.

If one wants to use rank images, it would make more sense to use
$$
\sigma^\star_{LG}:=\sum_{k=1}^K(d_k^\star-d_k)\delta_k,
$$
because, again, by the rearrangement inequality this is a directionally differentiable difference of two convex functions, similar to the one used by Shepard. Shepard himself *evaluates* the departure from monotonicity, after convergence, using
$$
\frac12\sum_{k=1}^K(\delta_k-\hat\delta_k)^2=\sum_{k=1}^K(\delta_k-\hat\delta_k)\delta_k=\sum_{k=1}^K(\hat\delta_k-\delta_k)\hat\delta_k,
$$
which is again hopelessly non-smooth.
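The difference between the two one-liners is easy to see on a toy example (the numbers below are made up purely for illustration). Shepard's rearrangement permutes the *dissimilarities* so they are monotone with the distances; the Guttman-Lingoes rank images permute the *distances* so they are monotone with the dissimilarities.

```r
delta <- c(1, 3, 2, 5, 4)            # toy dissimilarities
dist  <- c(2.0, 1.0, 3.0, 2.5, 4.0)  # toy distances of some configuration

# Shepard's rearrangement: sorted dissimilarities in the order of the distances
dhat  <- sort(delta)[rank(dist)]
# Guttman-Lingoes rank images: sorted distances in the order of the dissimilarities
dstar <- sort(dist)[rank(delta)]
```

Because `dstar` is a permutation of `dist`, the sums of squares cancel and the identity $\frac12\sum_k(d_k-d_k^\star)^2=\sum_k(d_k-d_k^\star)d_k$ from the display above holds exactly for these numbers as well.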
# Appendix: Code

## shepard.R

```r
suppressPackageStartupMessages(library(mgcv, quietly = TRUE))

source("auxilary.R")
source("mdsUtils.R")
source("smacofShepard62.R")
```

## auxilary.R

```r
mprint <- function(x, d = 6, w = 8, f = "") {
  print(noquote(formatC(
    x, di = d, wi = w, fo = "f", flag = f
  )))
}

directSum <- function(x) {
  m <- length(x)
  nr <- sum(sapply(x, nrow))
  nc <- sum(sapply(x, ncol))
  z <- matrix(0, nr, nc)
  kr <- 0
  kc <- 0
  for (i in 1:m) {
    ir <- nrow(x[[i]])
    ic <- ncol(x[[i]])
    z[kr + (1:ir), kc + (1:ic)] <- x[[i]]
    kr <- kr + ir
    kc <- kc + ic
  }
  return(z)
}

repList <- function(x, n) {
  z <- list()
  for (i in 1:n)
    z <- c(z, list(x))
  return(z)
}

shapeMe <- function(x) {
  m <- length(x)
  n <- (1 + sqrt(1 + 8 * m)) / 2
  d <- matrix(0, n, n)
  k <- 1
  for (i in 2:n) {
    for (j in 1:(i - 1)) {
      d[i, j] <- d[j, i] <- x[k]
      k <- k + 1
    }
  }
  return(d)
}

symmetricFromTriangle <- function(x, lower = TRUE, diagonal = TRUE) {
  k <- length(x)
  if (diagonal)
    n <- (sqrt(1 + 8 * k) - 1) / 2
  else
    n <- (sqrt(1 + 8 * k) + 1) / 2
  if (n != as.integer(n))
    stop("input error")
  nn <- 1:n
  if (diagonal && lower)
    m <- outer(nn, nn, ">=")
  if (diagonal && (!lower))
    m <- outer(nn, nn, "<=")
  if ((!diagonal) && lower)
    m <- outer(nn, nn, ">")
  if ((!diagonal) && (!lower))
    m <- outer(nn, nn, "<")
  b <- matrix(0, n, n)
  b[m] <- x
  b <- b + t(b)
  if (diagonal)
    diag(b) <- diag(b) / 2
  return(b)
}

triangleFromSymmetric <- function(x, lower = TRUE, diagonal = TRUE) {
  n <- ncol(x)
  nn <- 1:n
  if (diagonal && lower)
    m <- outer(nn, nn, ">=")
  if (diagonal && (!lower))
    m <- outer(nn, nn, "<=")
  if ((!diagonal) && lower)
    m <- outer(nn, nn, ">")
  if ((!diagonal) && (!lower))
    m <- outer(nn, nn, "<")
  return(x[m])
}
```

## mdsUtils.R

```r
torgerson <- function(delta, p = 2) {
  z <- slanczos(-doubleCenter((delta ^ 2) / 2), p)
  w <- matrix(0, p, p)
  v <- pmax(z$values, 0)
  diag(w) <- sqrt(v)
  return(z$vectors %*% w)
}

basisPrep <- function(n, w = rep(1, n * (n - 1) / 2)) {
  m <- n * (n - 1) / 2
  v <- -symmetricFromTriangle(w, diagonal = FALSE)
  diag(v) <- -rowSums(v)
  ev <- eigen(v)
  eval <- ev$values[1:(n - 1)]
  evec <- ev$vectors[, 1:(n - 1)]
  z <- evec %*% diag(1 / sqrt(eval))
  a <- array(0, c(n - 1, n - 1, m))
  k <- 1
  for (j in 1:(n - 1)) {
    for (i in (j + 1):n) {
      dif <- z[i, ] - z[j, ]
      a[, , k] <- outer(dif, dif)
      k <- k + 1
    }
  }
  return(list(z = z, a = a))
}

configurationInBasis <- function(x, z) {
  n <- nrow(x)
  p <- ncol(x)
  r <- p * (n - 1)
  y <- rep(0, r)
  for (s in 1:p) {
    k <- (s - 1) * (n - 1) + 1:(n - 1)
    y[k] <- crossprod(z, x[, s]) / diag(crossprod(z))
  }
  return(y)
}

columnCenter <- function(x) {
  return(apply(x, 2, function(z) z - mean(z)))
}

doubleCenter <- function(x) {
  n <- nrow(x)
  j <- diag(n) - (1 / n)
  return(j %*% x %*% j)
}

squareDistances <- function(x) {
  d <- diag(x)
  return(outer(d, d, "+") - 2 * x)
}
```

## smacofShepard62.R

```r
smacofShepard62 <- function(delta,
                            p = 2,
                            xini = torgerson(symmetricFromTriangle(delta, diagonal = FALSE), p),
                            top = 100,
                            norm = 1,
                            itmax = 1000,
                            eps = 1e-10,
                            verbose = FALSE) {
  sdelta <- sort(delta)
  m <- length(delta)
  n <- (1 + sqrt(1 + 8 * m)) / 2
  r <- p * (n - 1)
  h <- basisPrep(n, rep(1, m))
  xold <- rep(0, r)
  for (s in 1:p) {
    k <- (s - 1) * (n - 1) + 1:(n - 1)
    xold[k] <- crossprod(h$z, xini[, s]) / diag(crossprod(h$z))
  }
  d <- rep(0, m)
  itel <- 1
  repeat {
    bmat <- matrix(0, n - 1, n - 1)
    xx <- matrix(xold, n - 1, p)
    for (k in 1:m) {
      ak <- h$a[, , k]
      d[k] <- sqrt(sum(xx * (ak %*% xx)))
    }
    sumd <- ifelse(norm == 1, sum(d), sqrt(sum(d ^ 2)))
    dhat <- sdelta[rank(d)]
    sold <- (sum(dhat * d) - sum(delta * d)) / sumd
    for (k in 1:m) {
      ak <- h$a[, , k]
      if (norm == 1)
        bmat <- bmat + ((dhat[k] - delta[k] - sold) / d[k]) * ak
      else
        bmat <- bmat + ((dhat[k] - delta[k]) / d[k] - sold / sumd) * ak
    }
    bmat <- bmat / sumd
    grad <- as.vector(bmat %*% xx)
    hstp <- optimize(
      gfunc,
      c(0, top),
      x = xold,
      grad = grad,
      delta = delta,
      sdelta = sdelta,
      a = h$a,
      norm = norm
    )
    snew <- hstp$objective
    xnew <- xold - hstp$minimum * grad
    if (verbose)
      cat(
        "Iteration: ", formatC(itel, width = 3, format = "d"),
        "sold: ", formatC(sold, digits = 8, width = 12, format = "f"),
        "snew: ", formatC(snew, digits = 8, width = 12, format = "f"),
        "stepsize: ", formatC(hstp$minimum, digits = 8, width = 12, format = "f"),
        "\n"
      )
    if ((itel == itmax) || ((sold - snew) < eps))
      break
    xold <- drop(xnew)
    itel <- itel + 1
  }
  xconf <- matrix(0, n, p)
  for (s in 1:p) {
    k <- (s - 1) * (n - 1) + 1:(n - 1)
    xconf[, s] <- h$z %*% xnew[k]
  }
  return(list(
    delta = delta,
    dist = d,
    x = xconf,
    xvec = xnew,
    grad = grad,
    itel = itel,
    stress = snew
  ))
}

gfunc <- function(y, x, grad, delta, sdelta, a, norm) {
  m <- dim(a)[3]
  n <- dim(a)[1]
  p <- length(x) / n
  z <- x - y * grad
  xx <- matrix(z, n, p)
  d <- rep(0, m)
  for (k in 1:m) {
    ak <- a[, , k]
    d[k] <- sqrt(sum(xx * (ak %*% xx)))
  }
  dhat <- sdelta[rank(d)]
  sumd <- ifelse(norm == 1, sum(d), sqrt(sum(d ^ 2)))
  val <- (sum(dhat * d) - sum(delta * d)) / sumd
  return(val)
}
```

# References