# Multidimensional Scaling with Distance Bounds

Jan de Leeuw

Version 12, January 16, 2017

Note: This is a working paper which will be expanded/updated frequently. All suggestions for improvement are welcome. The directory [gifi.stat.ucla.edu/updown](http://gifi.stat.ucla.edu/updown) has a pdf version, the complete Rmd file with all code chunks, the bib file, and the R source code.

#Introduction

In this paper we study the minimization of *stress*, defined as
$$\mathbf{stress}(X):=\frac14\sum_{i=1}^n\sum_{j=1}^n w_{ij}(\delta_{ij}-d_{ij}(X))^2$$
over $n\times p$ *configuration matrices* $X$. Here $W=\{w_{ij}\}$ and $\Delta=\{\delta_{ij}\}$ are given symmetric, hollow, and non-negative matrices of, respectively, *weights* and *dissimilarities*. The $d_{ij}(X)$ are Euclidean distances between the points with coordinates in the rows of $X$. Thus
$$d_{ij}^2(X):=(x_i-x_j)'(x_i-x_j)=(e_i-e_j)'XX'(e_i-e_j)=\mathbf{tr}\ X'A_{ij}X.$$
Here $e_i$ and $e_j$ are unit vectors, with all elements equal to zero except for the single element $i$ or $j$, which is equal to one. Also $A_{ij}:=(e_i-e_j)(e_i-e_j)'$.

The problem of minimizing **stress** is well-studied (see, for example, @borg_groenen_05). In this paper we analyze the problem of minimizing **stress** over all configurations $X$ for which $\alpha_{ij}\leq d_{ij}(X)\leq\beta_{ij}$ for some or all pairs $i,j$, where the $\alpha_{ij}$ and $\beta_{ij}$ are given bounds satisfying $0\leq\alpha_{ij}<\beta_{ij}\leq+\infty$. We suppose the constraints are strongly consistent, i.e. there is at least one $n\times p$ configuration matrix $X$ such that $\alpha_{ij}<d_{ij}(X)<\beta_{ij}$ for all constrained pairs $i,j$.

Figure 1: Equidistance Data, Unconstrained Left, Lower Bounds Right

In the second analysis, now using eight objects with equal dissimilarities, we require $d_{ij}(X)\geq 1$ for all $i,j$ with $|i-j|=1$, and $d_{ij}(X)\leq 2$ for all $i,j$ with $|i-j|=2$.
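Before looking at the results, it may help to see the loss function in code. The following is a minimal sketch, not taken from the appendix code: it evaluates raw **stress** over the pairs $i<j$ (the factor $\frac12$ matches the $\frac14$ double sum over the full symmetric matrix) and checks feasibility of a set of bounds. The function names `stressValue` and `isFeasible` are illustrative.

```r
stressValue <- function (x, delta, w = rep (1, length (delta))) {
  # delta and w are vectors over the pairs i < j, in the order used by dist();
  # the factor 1/2 corresponds to the 1/4 double sum over all i, j
  d <- as.vector (dist (x))
  return (sum (w * (delta - d) ^ 2) / 2)
}

isFeasible <- function (x, bndlw, bndup) {
  # TRUE if all distances satisfy the lower and upper bounds
  d <- as.vector (dist (x))
  return (all (d >= bndlw) && all (d <= bndup))
}

# four points on the unit square, all six dissimilarities equal to one:
# the four sides have distance 1, the two diagonals sqrt(2)
x <- matrix (c(0, 0, 1, 1, 0, 1, 0, 1), 4, 2)
delta <- rep (1, 6)
stressValue (x, delta)                     # (1 - sqrt(2))^2, about 0.1716
isFeasible (x, rep (0, 6), rep (2, 6))     # TRUE
```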
In the unconstrained case we find **stress** equal to 0.0973520592 after 132 iterations, in the constrained case we find 0.1183956861 after 357 iterations. The lower bound constraints are all active, which means the distances between successive points are all one (indicated with lines in figure 2). The upper bound constraints are all inactive.

Figure 2: Equidistance Data, Unconstrained Left, Lower Bounds Right

So far we have assumed $\alpha_{ij}<\beta_{ij}$ for all $i,j$. It is possible, however, to let $\alpha_{ij}=\beta_{ij}$, which means we require $d_{ij}(X)=\alpha_{ij}=\beta_{ij}$. Note the complete set of constraints still needs to be consistent, and the initial configuration must still be feasible. In our example we require all seven distances between successive points to be equal to one. Our algorithm in this case becomes extremely slow. It stops at the upper bound of 2000 iterations, with **stress** 0.145422896. Of course all constraints are now active. The Lagrange multipliers are large, serving as penalty parameters to force the equality constraints.

```
## [1]  607.209037 798.640263 184.410120 470.471412 837.258144 684.253342
## [7]  342.484954 607.726977 799.263569 184.916274 470.878355 838.697106
## [13] 685.017712 342.476621
```

Figure 3: Equidistance Data, Distances Constrained to One

Note, however, that the previous analysis, where we merely required $d_{ij}(X)\geq 1$ for $|i-j|=1$, produced a solution feasible for the current problem with **stress** equal to 0.1183956861. This makes it interesting to see what goes on if we increase the upper bound on the number of iterations even more, say, to 10000. We now find **stress** 0.1016650356 after 4638 iterations. The Lagrange multipliers for this solution are much smaller (the first seven correspond with upper bounds, the second seven with lower bounds). What basically happens is that for each $(i,j)$ with $|i-j|=1$ either the upper or the lower constraint is seen to be violated and gets a zero Lagrange multiplier.
The Lagrange multiplier for the other constraint of the pair is positive, because that constraint is interpreted as being satisfied. This is a consequence of using floating point with limited precision, and of the difference between $\alpha_{ij}=0.99\cdots$ and $\beta_{ij}=1.00\cdots$.

```
## [1] 0.1769885825 0.0000000000 0.2256740368 0.0000000001 0.0000005800
## [6] 0.0000000012 0.0707712910
```

```
## [1] 0.0000094723 0.2413717195 0.0000020656 0.0233523251 0.1287322429
## [6] 0.1247717131 0.0000000002
```

The successive distances are

```
## [1] 0.9999999397 1.0000000032 1.0000000368 1.0000000221 0.9999999739
## [6] 0.9999999647 0.9999999492
```

Figure 4: Equidistance Data, Distances Constrained to One

##Dutch Political Parties 1967

As the next illustration we use data from @degruijter_67, with average dissimilarity judgments between Dutch political parties in 1967. The data are

```
##       KVP PvdA  VVD  ARP  CHU  CPN  PSP   BP
## PvdA 5.63
## VVD  5.27 6.72
## ARP  4.60 5.64 5.46
## CHU  4.80 6.22 4.97 3.20
## CPN  7.54 5.12 8.13 7.84 7.80
## PSP  6.73 4.59 7.55 6.73 7.08 4.08
## BP   7.18 7.22 6.90 7.28 6.96 6.34 6.88
## D66  6.17 5.47 4.67 6.13 6.04 7.42 6.36 7.36
```

First, three different but comparable analyses are done. The first imposes no restrictions, the second is *MDS from below*, which requires $d_{ij}(X)\leq\delta_{ij}$ for all $i,j$, and the third is *MDS from above*, for which $d_{ij}(X)\geq\delta_{ij}$ for all $i,j$. The configurations are quite similar, except for the position of D'66, which at the time was a novelty in Dutch politics. The value of **stress** at the solutions is, respectively, 0.044603386, 0.0752702106, and 0.280130691.

Figure 5: De Gruijter Data, Unconstrained Solution

Figure 6: De Gruijter Data, MDS from Below

Figure 7: De Gruijter Data, MDS from Above

The same data are used in the next analysis, which requires $2\leq d_{ij}(X)\leq 8$ for all $i,j$. We converge to **stress** 0.0668519523 in 119 iterations.
There are eight active constraints, with four distances equal to the lower bound and four equal to the upper bound. The configuration in figure 8 shows the distances equal to the lower bound in blue and those equal to the upper bound in red. The active constraints are also clearly visible in the Shepard plot in figure 8.

Figure 8: De Gruijter Data, Distance Intervals

#Appendix: Code

##updown.R

```r
suppressPackageStartupMessages (library (mgcv, quietly = TRUE))
suppressPackageStartupMessages (library (MASS, quietly = TRUE))
suppressPackageStartupMessages (library (lsei, quietly = TRUE))
suppressPackageStartupMessages (library (numDeriv, quietly = TRUE))

source("auxilary.R")
source("mdsUtils.R")
source("nnnewton.R")
source("qpqc.R")
source("smacofUpDown.R")
```

##auxilary.R

```r
mprint <- function (x, d = 6, w = 8, f = "") {
  print (noquote (formatC (
    x, di = d, wi = w, fo = "f", flag = f
  )))
}

directSum <- function (x) {
  m <- length (x)
  nr <- sum (sapply (x, nrow))
  nc <- sum (sapply (x, ncol))
  z <- matrix (0, nr, nc)
  kr <- 0
  kc <- 0
  for (i in 1:m) {
    ir <- nrow (x[[i]])
    ic <- ncol (x[[i]])
    z[kr + (1:ir), kc + (1:ic)] <- x[[i]]
    kr <- kr + ir
    kc <- kc + ic
  }
  return (z)
}

repList <- function (x, n) {
  z <- list ()
  for (i in 1:n)
    z <- c (z, list (x))
  return (z)
}

shapeMe <- function (x) {
  m <- length (x)
  n <- (1 + sqrt (1 + 8 * m)) / 2
  d <- matrix (0, n, n)
  k <- 1
  for (i in 2:n) {
    for (j in 1:(i - 1)) {
      d[i, j] <- d[j, i] <- x[k]
      k <- k + 1
    }
  }
  return (d)
}

symmetricFromTriangle <- function (x, lower = TRUE, diagonal = TRUE) {
  k <- length (x)
  if (diagonal)
    n <- (sqrt (1 + 8 * k) - 1) / 2
  else
    n <- (sqrt (1 + 8 * k) + 1) / 2
  if (n != as.integer (n))
    stop ("input error")
  nn <- 1:n
  if (diagonal && lower)
    m <- outer (nn, nn, ">=")
  if (diagonal && (!lower))
    m <- outer (nn, nn, "<=")
  if ((!diagonal) && lower)
    m <- outer (nn, nn, ">")
  if ((!diagonal) && (!lower))
    m <- outer (nn, nn, "<")
  b <- matrix (0, n, n)
  b[m] <- x
  b <- b + t(b)
  if (diagonal)
    diag (b) <- diag (b) / 2
  return (b)
}

triangleFromSymmetric <- function (x, lower = TRUE, diagonal = TRUE) {
  n <- ncol (x)
  nn <- 1:n
  if (diagonal && lower)
    m <- outer (nn, nn, ">=")
  if (diagonal && (!lower))
    m <- outer (nn, nn, "<=")
  if ((!diagonal) && lower)
    m <- outer (nn, nn, ">")
  if ((!diagonal) && (!lower))
    m <- outer (nn, nn, "<")
  return (x[m])
}
```

##mdsUtils.R

```r
library (mgcv)

torgerson <- function (delta, p = 2) {
  z <- slanczos (-doubleCenter ((delta ^ 2) / 2), p)
  w <- matrix (0, p, p)
  v <- pmax (z$values, 0)
  diag (w) <- sqrt (v)
  return (z$vectors %*% w)
}

basisPrep <- function (n, p, w) {
  m <- n * (n - 1) / 2
  v <- -symmetricFromTriangle (w, diagonal = FALSE)
  diag (v) <- -rowSums (v)
  ev <- eigen (v)
  eval <- ev$values[1:(n - 1)]
  evec <- ev$vectors[, 1:(n - 1)]
  z <- evec %*% diag (1 / sqrt (eval))
  a <- array (0, c(n - 1, n - 1, m))
  k <- 1
  for (j in 1:(n - 1)) {
    for (i in (j + 1):n) {
      dif <- z[i, ] - z[j, ]
      a[, , k] <- outer (dif, dif)
      k <- k + 1
    }
  }
  return (list (z = z, a = a))
}

center <- function (x) {
  return (apply (x, 2, function (z) z - mean (z)))
}

doubleCenter <- function (x) {
  n <- nrow (x)
  j <- diag (n) - (1 / n)
  return (j %*% x %*% j)
}

squareDist <- function (x) {
  d <- diag (x)
  return (outer (d, d, "+") - 2 * x)
}
```

##smacofUpDown.R

```r
smacofUpDown <- function (delta,
                          xini,
                          w = rep (1, length (delta)),
                          bndlw = rep (0, length (delta)),
                          bndup = rep (Inf, length (delta)),
                          itmax = 1000,
                          eps = 1e-10,
                          verbose = FALSE) {
  m <- length (delta)
  p <- dim (xini)[2]
  n <- (1 + sqrt (1 + 8 * m)) / 2
  dini <- as.vector (dist (xini))
  if (any (dini > bndup) || any (dini < bndlw))
    stop ("initial estimate not feasible")
  wndup <- which (bndup < Inf)
  lup <- length (wndup)
  wndlw <- which (bndlw > 0)
  llw <- length (wndlw)
  ltt <- lup + llw
  r <- p * (n - 1)
  yold <- rep (0, ltt)
  h <- basisPrep (n, p, w)
  xold <- rep (0, r)
  for (s in 1:p) {
    k <- (s - 1) * (n - 1) + 1:(n - 1)
    xold[k] <- crossprod (h$z, xini[, s]) / diag (crossprod (h$z))
  }
  d <- rep (0, m)
  itel <- 1
  sold <- Inf
  ssq <- sum (w * delta ^ 2)
  repeat {
    xmid <- rep (0, r)
    xx <- matrix (xold, n - 1, p)
    t <- 1
    bcomp <- matrix (0, r, ltt + 1)
    for (k in 1:m) {
      ak <- h$a[, , k]
      ax <- ak %*% xx
      d[k] <- sqrt (sum (xx * ax))
      if (!is.na (match (k, wndlw))) {
        bcomp[, 1 + lup + t] <- -ax / d[k]
        t <- t + 1
      }
      xmid <- xmid + w[k] * (delta[k] / d[k]) * ax
    }
    if (ltt > 0) {
      ccomp <- c(c(ssq, -bndup[wndup] ^ 2) / 2, bndlw[wndlw])
      bcomp[, 1] <- -xmid
      acomp <- array (0, c(r, r, ltt + 1))
      acomp[, , 1] <- diag (r)
      t <- 1
      for (k in wndup) {
        ak <- directSum (repList (h$a[, , k], p))
        acomp[, , t + 1] <- ak
        t <- t + 1
      }
      v <- qpqc (yold, acomp, bcomp, ccomp, verbose = FALSE)
      snew <- 2 * v$f / ssq
      xnew <- v$xmin
      ynew <- v$multipliers
      cons <- v$constraints
    } else {
      xnew <- xmid
      snew <- sum (w * (delta - d) ^ 2) / ssq
      ynew <- NULL
      cons <- NULL
    }
    if (verbose)
      cat (
        "Iteration: ",
        formatC (itel, width = 3, format = "d"),
        "sold: ",
        formatC (sold, digits = 8, width = 12, format = "f"),
        "snew: ",
        formatC (snew, digits = 8, width = 12, format = "f"),
        "\n"
      )
    if ((itel == itmax) || ((sold - snew) < eps))
      break
    xold <- xnew
    sold <- snew
    yold <- ynew
    itel <- itel + 1
  }
  xconf <- matrix (0, n, p)
  for (s in 1:p) {
    k <- (s - 1) * (n - 1) + 1:(n - 1)
    xconf[, s] <- h$z %*% xnew[k]
  }
  return (list (
    delta = delta,
    dist = d,
    x = xconf,
    multipliers = ynew,
    constraints = cons,
    itel = itel,
    stress = sum (w * (delta - d) ^ 2) / ssq
  ))
}
```

#References