Note: This is a working paper which will be expanded/updated frequently. All suggestions for improvement are welcome. The directory gifi.stat.ucla.edu/third has a pdf version, the bib file, the complete Rmd file with the code chunks, and the R and C source code.

1 Introduction

The multidimensional scaling loss function fStress is defined as \[\begin{equation}\label{E:fstress} \sigma(x):=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}(\delta_{ij}-f(x'A_{ij}x))^2 \end{equation}\]

Here \(x=\mathbf{vec}(X)\), with \(X\) the usual MDS configuration of \(n\) points in \(p\) dimensions. The distance between points \(i\) and \(j\) in the configuration is \(d_{ij}(x):=\sqrt{x'A_{ij}x}\).

The \(np\times np\) matrices \(A_{ij}\) are defined using unit vectors \(e_i\) and \(e_j\) of length \(n\), which are zero except for a single element equal to one. If you like, the \(e_i\) are the columns of the identity matrix. Define \[ E_{ij}:=(e_i-e_j)(e_i-e_j)', \] and use \(p\) copies of \(E_{ij}\) to make the direct sum \[ A_{ij}:=\underbrace{E_{ij}\oplus\cdots\oplus E_{ij}}_{p\text{ times}}. \] Assume, without loss of generality, that the dissimilarities are normalized so that \(\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}\delta_{ij}^2=1\). Then \[ \sigma(x)=1-2\rho(x)+\eta^2(x), \] with \[ \rho(x):=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}\delta_{ij}f(x'A_{ij}x), \] and \[ \eta^2(x):=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}f^2(x'A_{ij}x). \]
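As a concrete illustration, here is a minimal R sketch (our own, not the production code in the repository mentioned above) that builds the matrices \(A_{ij}\) as direct sums of \(p\) copies of \(E_{ij}\), checks that \(x'A_{ij}x\) is the squared distance \(d_{ij}^2(X)\), and evaluates fStress. The helper names (`eij`, `aij`, `fstress`) and the random test data are assumptions for the example only.

```r
## Minimal sketch (not the paper's production code).

eij <- function(i, j, n) {
  u <- rep(0, n); u[i] <- 1; u[j] <- -1
  outer(u, u)                               # E_ij = (e_i - e_j)(e_i - e_j)'
}

aij <- function(i, j, n, p) {
  kronecker(diag(p), eij(i, j, n))          # direct sum of p copies of E_ij
}

fstress <- function(x, delta, w, n, p, f = sqrt) {
  s <- 0
  for (j in 2:n) for (i in 1:(j - 1)) {
    fd <- f(drop(crossprod(x, aij(i, j, n, p) %*% x)))
    s <- s + w[i, j] * (delta[i, j] - fd)^2
  }
  s
}

set.seed(12345)
n <- 5; p <- 2
X <- matrix(rnorm(n * p), n, p)
x <- as.vector(X)                           # x = vec(X), columns stacked
all.equal(drop(crossprod(x, aij(1, 3, n, p) %*% x)),
          as.matrix(dist(X))[1, 3]^2)       # TRUE: x'A_13 x = d_13^2

W <- 1 - diag(n)                            # unit weights, zero diagonal
Delta <- as.matrix(dist(matrix(rnorm(n * p), n, p)))      # arbitrary dissimilarities
Delta <- Delta / sqrt(sum(W * Delta^2) / 2) # normalize sum_{i<j} w delta^2 = 1
fstress(x, Delta, W, n, p, f = sqrt)        # fStress with f = sqrt (raw stress)
```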

2 Partials, Partials Everywhere

2.1 Derivatives of fDistances

Consider functions of the form \[ f(x)=g(x'Ax), \] with \(A\) symmetric. Write \(b:=Ax\), so that \(b_k\) is element \(k\) of \(Ax\), and let \(a_{kl}\) be the elements of \(A\). Repeated application of the chain rule gives \[ \mathcal{D}_kf(x)=2\mathcal{D}g(x'Ax)b_k, \] \[ \mathcal{D}_{kl}f(x)=2\mathcal{D}g(x'Ax)a_{kl}+4\mathcal{D}^2g(x'Ax)b_kb_l, \] \[ \mathcal{D}_{klv}f(x)=4\mathcal{D}^2g(x'Ax)\{a_{kl}b_v+a_{lv}b_k+a_{kv}b_l\}+8\mathcal{D}^3g(x'Ax)b_kb_lb_v, \]

and \[\begin{multline*} \mathcal{D}_{klvu}f(x)=8\mathcal{D}^3g(x'Ax)\{a_{kl}b_vb_u+a_{lv}b_kb_u+a_{kv}b_lb_u+a_{ku}b_lb_v+a_{vu}b_kb_l+a_{lu}b_kb_v\}+\\4\mathcal{D}^2g(x'Ax)\{a_{kl}a_{vu}+a_{lv}a_{ku}+a_{kv}a_{lu}\}+16\mathcal{D}^4g(x'Ax)b_kb_lb_vb_u. \end{multline*}\]
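These formulas are easy to check numerically. The R sketch below (our own, not part of the paper's code) compares the first and second derivative formulas with central finite differences for the particular choice \(g(t)=\sqrt t\) and a random symmetric \(A\); here \(b=Ax\).

```r
## Numerical check of the first two derivative formulas for f(x) = g(x'Ax),
## with b = Ax. The matrix A and the choice g(t) = sqrt(t) are test inputs.

set.seed(12345)
m <- 4
A <- crossprod(matrix(rnorm(m * m), m, m))   # random symmetric A
x <- rnorm(m)

g   <- function(t) sqrt(t)
dg  <- function(t) 1 / (2 * sqrt(t))         # Dg
d2g <- function(t) -1 / (4 * t^(3 / 2))      # D^2 g

f <- function(x) g(drop(crossprod(x, A %*% x)))

t0 <- drop(crossprod(x, A %*% x))
b  <- drop(A %*% x)

grad_formula <- 2 * dg(t0) * b                              # D_k f = 2 Dg b_k
hess_formula <- 2 * dg(t0) * A + 4 * d2g(t0) * outer(b, b)  # 2 Dg a_kl + 4 D^2g b_k b_l

## central finite differences
h <- 1e-5
num_grad <- sapply(1:m, function(k) {
  e <- replace(rep(0, m), k, h)
  (f(x + e) - f(x - e)) / (2 * h)
})
num_hess <- outer(1:m, 1:m, Vectorize(function(k, l) {
  ek <- replace(rep(0, m), k, h); el <- replace(rep(0, m), l, h)
  (f(x + ek + el) - f(x + ek - el) - f(x - ek + el) + f(x - ek - el)) / (4 * h^2)
}))

all.equal(grad_formula, num_grad, tolerance = 1e-6)
all.equal(hess_formula, num_hess, tolerance = 1e-4)
```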

2.2 Derivatives of Stress

The stress loss function is \[ \sigma(x):=\frac12\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}(\delta_{ij}-d_{ij}(x))^2, \] where, as before, \(x=\mathbf{vec}(X)\), with \(X\) the configuration of \(n\) points in \(p\) dimensions, \(d_{ij}(x):=\sqrt{x'A_{ij}x}\), and the \(A_{ij}\) are the direct sums defined in the introduction.

Assume, without loss of generality, that the dissimilarities are normalized so that \(\frac12\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}\delta_{ij}^2=1\). Then \[ \sigma(x)=1-\rho(x)+\frac12x'Vx, \] with \[ \rho(x):=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}\delta_{ij}\sqrt{x'A_{ij}x}, \] and \[ V:=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}A_{ij}. \] The function \(\frac12x'Vx\) is quadratic, so its derivatives are trivial to compute.
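The decomposition can be verified numerically. The following R sketch (our own names and arbitrary test data, using the normalization \(\frac12\sum\sum w_{ij}\delta_{ij}^2=1\) assumed above) forms \(V\) by summing the \(w_{ij}A_{ij}\) and checks that \(\sigma(x)=1-\rho(x)+\frac12x'Vx\).

```r
## Sketch: form V = sum w_ij A_ij and check sigma(x) = 1 - rho(x) + (1/2) x'Vx.

aij <- function(i, j, n, p) {
  u <- rep(0, n); u[i] <- 1; u[j] <- -1
  kronecker(diag(p), outer(u, u))             # direct sum of p copies of E_ij
}

set.seed(12345)
n <- 5; p <- 2
X <- matrix(rnorm(n * p), n, p)
x <- as.vector(X)                             # x = vec(X)
W <- 1 - diag(n)                              # unit weights, zero diagonal
Delta <- as.matrix(dist(matrix(rnorm(n * p), n, p)))
Delta <- Delta / sqrt(sum(W * Delta^2) / 4)   # now (1/2) sum_{i<j} w delta^2 = 1

D  <- as.matrix(dist(X))
up <- upper.tri(W)
sigma <- sum(W[up] * (Delta[up] - D[up])^2) / 2
rho   <- sum(W[up] * Delta[up] * D[up])

V <- matrix(0, n * p, n * p)
for (j in 2:n) for (i in 1:(j - 1)) V <- V + W[i, j] * aij(i, j, n, p)

all.equal(sigma, 1 - rho + drop(crossprod(x, V %*% x)) / 2)   # TRUE
```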

To find the derivatives of \(\rho\) we expand \[ \rho(x+y)=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}\delta_{ij}\sqrt{(x+y)'A_{ij}(x+y)} \] as the Taylor series \[ \rho(x+y)=\rho(x)+y'\mathcal{D}\rho(x)+\frac12 y'\mathcal{D}^2\rho(x)y+\cdots, \] with

\[ \mathcal{D}\rho(x)=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}\frac{\delta_{ij}}{d_{ij}(x)}A_{ij}x \]

and \[ \mathcal{D}^2\rho(x)=\mathop{\sum\sum}_{1\leq i<j\leq n}w_{ij}\frac{\delta_{ij}}{d_{ij}(x)}\left\{A_{ij}-\frac{A_{ij}xx'A_{ij}}{x'A_{ij}x}\right\}. \]
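As a sanity check, the gradient and Hessian of \(\rho\) given above can be compared with central finite differences. The R sketch below (again with our own helper names and random test data, not the paper's code) does exactly that.

```r
## Numerical check of the gradient and Hessian formulas for rho.

aij <- function(i, j, n, p) {
  u <- rep(0, n); u[i] <- 1; u[j] <- -1
  kronecker(diag(p), outer(u, u))
}

set.seed(12345)
n <- 5; p <- 2
X <- matrix(rnorm(n * p), n, p); x <- as.vector(X)
W <- 1 - diag(n)
Delta <- as.matrix(dist(matrix(rnorm(n * p), n, p)))

rho <- function(x) {
  s <- 0
  for (j in 2:n) for (i in 1:(j - 1))
    s <- s + W[i, j] * Delta[i, j] * sqrt(drop(crossprod(x, aij(i, j, n, p) %*% x)))
  s
}

grad_formula <- rep(0, n * p)
hess_formula <- matrix(0, n * p, n * p)
for (j in 2:n) for (i in 1:(j - 1)) {
  A  <- aij(i, j, n, p)
  Ax <- drop(A %*% x)
  d  <- sqrt(sum(x * Ax))                     # d_ij(x)
  grad_formula <- grad_formula + W[i, j] * (Delta[i, j] / d) * Ax
  hess_formula <- hess_formula + W[i, j] * (Delta[i, j] / d) * (A - outer(Ax, Ax) / d^2)
}

h <- 1e-5
num_grad <- sapply(1:(n * p), function(k) {
  e <- replace(rep(0, n * p), k, h)
  (rho(x + e) - rho(x - e)) / (2 * h)
})
num_hess <- outer(1:(n * p), 1:(n * p), Vectorize(function(k, l) {
  ek <- replace(rep(0, n * p), k, h); el <- replace(rep(0, n * p), l, h)
  (rho(x + ek + el) - rho(x + ek - el) - rho(x - ek + el) + rho(x - ek - el)) / (4 * h^2)
}))

all.equal(grad_formula, num_grad, tolerance = 1e-6)
all.equal(hess_formula, num_hess, tolerance = 1e-4)
```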