\documentclass[12pt,]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\usepackage{fixltx2e} % provides \textsubscript
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\else % if luatex or xelatex
\ifxetex
\usepackage{mathspec}
\usepackage{xltxtra,xunicode}
\else
\usepackage{fontspec}
\fi
\defaultfontfeatures{Mapping=tex-text,Scale=MatchLowercase}
\newcommand{\euro}{€}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
% use microtype if available
\IfFileExists{microtype.sty}{%
\usepackage{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\usepackage[margin=1in]{geometry}
\ifxetex
\usepackage[setpagesize=false, % page size defined by xetex
unicode=false, % unicode breaks when used with xetex
xetex]{hyperref}
\else
\usepackage[unicode=true]{hyperref}
\fi
\hypersetup{breaklinks=true,
bookmarks=true,
pdfauthor={Jan de Leeuw, Patrick Groenen, Patrick Mair},
pdftitle={Singularities and Zero Distances in Multidimensional Scaling},
colorlinks=true,
citecolor=blue,
urlcolor=blue,
linkcolor=magenta,
pdfborder={0 0 0}}
\urlstyle{same} % don't use monospace font for urls
\usepackage{graphicx,grffile}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{5}
%%% Use protect on footnotes to avoid problems with footnotes in titles
\let\rmarkdownfootnote\footnote%
\def\footnote{\protect\rmarkdownfootnote}
%%% Change title format to be more compact
\usepackage{titling}
% Create subtitle command for use in maketitle
\newcommand{\subtitle}[1]{
\posttitle{
\begin{center}\large#1\end{center}
}
}
\setlength{\droptitle}{-2em}
\title{Singularities and Zero Distances in Multidimensional Scaling}
\pretitle{\vspace{\droptitle}\centering\huge}
\posttitle{\par}
\author{Jan de Leeuw, Patrick Groenen, Patrick Mair}
\preauthor{\centering\large\emph}
\postauthor{\par}
\predate{\centering\large\emph}
\postdate{\par}
\date{Version 01, Februari 15, 2016}
% Redefines (sub)paragraphs to behave more like sections
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}[1]{\oldsubparagraph{#1}\mbox{}}
\fi
\begin{document}
\maketitle
{
\hypersetup{linkcolor=black}
\setcounter{tocdepth}{3}
\tableofcontents
}
Note: This is a working paper which will be expanded/updated frequently.
The directory
\href{http://gifi.stat.ucla.edu/zeroes}{gifi.stat.ucla.edu/zeroes} has a
pdf copy of this article, the complete Rmd file with all code chunks,
the bib file, and the R source code.
\section{Problem}\label{problem}
Multidimensional scaling (MDS) in this paper is minimization of the loss
function \emph{stress}, introduced by Kruskal (1964a), Kruskal (1964b).
Stress is defined as
\begin{equation}
\sigma(X):=\mathop{\sum\sum}_{1\leq i0,\\
0&\text{ if }d_{ij}(X)=0,
\end{cases}\\
b_{ii}(X)&:=-\sum_{1\leq j\not= i\leq n} b_{ij}(X)
\end{align*}
An important result on zero distances at a local minimum was proved by
De Leeuw (1984). It is based on a formula for the Dini directional
derivative (see also De Leeuw, Groenen, and Mair (2016)).
\textbf{Theorem 1: {[}Directional Derivative{]}} \[
\lim_{\epsilon\downarrow 0}\frac{\sigma(X+\epsilon Y)-\sigma(X)}{\epsilon}=\mathbf{tr}\ Y'(V-B(X))X-\sum\{w_{ij}\delta_{ij}d_{ij}(Y)\mid d_{ij}(X)=0\}
\]
\textbf{Proof:} Use \[
d_{ij}(X+\epsilon Y)=d_{ij}(X)+\epsilon\begin{cases}\frac{1}{d_{ij}(X)}\mathbf{tr}\ X'A_{ij}Y&\text{ if }d_{ij}(X)>0,\\
d_{ij}(Y)&\text{ if }d_{ij}(X)=0.
\end{cases}
\] to get the required result. \textbf{QED}
\textbf{Theorem 2: {[}Necessary Condition{]}} If \(\sigma\) has a local
minimum at \(X\) then \((V-B(X))X=0\) and \(d_{ij}(X)>0\) for all
\(i\not= j\) such that \(w_{ij}\delta_{ij}>0\).
\textbf{Proof:} Direct from theorem 1. \textbf{QED}
If \(\sigma\) has a local minimum at \(X\) and is differentiable at
\(X\), then the derivative vanishes. In the usual MDS problems we have
\(w_{ij}\delta_{ij}>0\) for all \(i\not= j\), and theorem 2 says that
stress is differentiable at local minima.
Points where the derivative exists and vanishes are stationary points,
which are not necessarily local minima. Because of the possibility of
zero distances we need a more general notion of a stationary point,
which we now define as points where \(0\in\partial\sigma(X)\), with
\(\partial\sigma(X)\) the subdifferential of \(\sigma\) at \(X\). We
know (De Leeuw (1977)) that \(\sigma\) is the difference of a convex
quadratic \(\eta:=\mathbf{tr}\ X'VX\) and a convex function
\(\rho(X):+\mathbf{tr}\ X'B(X)X\), and thus the subdifferential is
simply a translation of the convex subdifferential, or
\(\partial\sigma(X)=2(VX-\partial\rho(X))\).
\textbf{Theorem 3: {[}Stationary Points{]}} We have
\(0\in\partial\sigma(X)\) if and only if for all \(i0,\\
\left\{(e_i-e_j)u'\mid u'u\leq 1\right\}&\text{ if }d_{ij}(X)=0.
\end{cases}
\] \textbf{QED}
Note that theorem 3 implies that always
\((V-B(X))X\in\partial\sigma(X)\), whether there are zero distances or
not.
\section{Zero Distances}\label{zero-distances}
Theorem 2 tells us that at local minima \(w_{ij}\delta_{ij}=0\) if
\(d_{ij}(X)=0\). But this leaves open some interesting questions about
stationary points and local minima that are not covered by this result.
Can there be local minima with zero distances ? If we allow for zero
dissimilarities the answer is clearly yes. Take \(\Delta\) equal to
\(D(X)\) where the configuration \(X\) has at least one zero distance.
Then \(X\) gives the global minimum of stress, equal to zero. And a
global minimum is, of course, also a local minimum. If we allow for zero
weights, then the answer is again yes, even if all dissimilarties are
positive. Suppose we have four objects with all dissimilarities 1. We
are fitting two-dimensional configurations. If all weights are one, the
global minimum is attained for four points in the corners of a square.
If we set \(w_{12}=0\), however, we get perfect fit by putting points in
the corners of an equilateral triangle, with objects 1 and 2 together in
one corner. Since this makes stress zero, it is the global minimum. If
we set, in addition, \(w_{34}=0\) then we get the global minimum by
having two points on the line, one with objects 1 and 2, and one with
objects 3 and 4.
A related question is if there can there be solutions to \((V-B(X))X\)
with zero distances ? We answer this more systematically. Suppose \(X\)
satisfies the stationary equations \((V-B(X))X=0\) and \(X\) is of block
form, with all rows within a block the same, and rows in different
blocks different. If there are \(m\) blocks, then we can write
\(X=G\overline{X}\), where \(G\) is an \(n\times m\) binary indicator
matrix, indicating block membership. Suppose \(I_j\) are the indices of
the objects in block \(j\).
It follows that \((V-B(X))G\overline{X}=0\) and thus
\((\overline{V}-\overline{B}(X))\overline{X}=0\), where
\(\overline{V}:=G'VG\) and \(\overline{B}(X):=G'B(X)X\). We can write,
somewhat suggestively, \(\overline{B}(X)=B(\overline X)\). Use the fact
that all between-set distances in \(B_{j\ell}(X)\) are equal to
\(d_{j\ell}(\overline{X})\). Thus if, for \(j\not=\ell\)
\begin{align}
\overline{w}_{j\ell}&:=-\sum_{i\in I_j}\sum_{k\in I_\ell}w_{ik},\label{E:wcls}\\
\overline{b}_{j\ell}(X)&:=-\frac{1}{d_{j\ell}(\overline{X})}\sum_{i\in I_j}\sum_{k\in I_\ell}w_{ik}\delta_{ik},\label{E:bcls}
\end{align}
and the diagonal elements are filled in the make the matrices
doubly-centered.
So the stationary equations for an MDS problem with zero distances have
a \emph{reduced} or \emph{clustered} set of stationary equations, of the
order of the number of \emph{clusters} of points, which do not involve
zero distance. This reduced MDS problem can also be interpreted as the
original MDS problem, but with the constraints that there is some fixed
clustering of points.
Conversely, if we have stationary equations for an MDS problem with
nono-zero weights and distances, then we can always expand them to
stationary equations for a problem with zero distances, simply by
finding \(W\) and \(\Delta\) that satisfy \(\eqref{E:wcls}\) and
\(\eqref{E:bcls}\).
\section{Singularities}\label{singularities}
Suppose \((V-B(X))X=0\) and the \(n\times p\) matrix \(X\) has rank
\(r