---
title: "Quadratic Programming with Quadratic Constraints"
author: "Jan de Leeuw"
date: "Version 14, January 03, 2017"
output:
  html_document:
    keep_md: yes
    number_sections: yes
    toc: yes
    toc_depth: 3
  pdf_document:
    keep_tex: yes
    number_sections: yes
    toc: yes
    toc_depth: 3
fontsize: 12pt
graphics: yes
bibliography: dual.bib
abstract: We give a quick and dirty, but reasonably safe, algorithm for the minimization of a convex quadratic function under convex quadratic constraints. The algorithm minimizes the Lagrangian dual by using a safeguarded Newton method with non-negativity constraints.
---

```{r function_code, echo = FALSE}
source("dual.R")
```

```{r packages, echo = FALSE}
options (digits = 10)
library (captioner)
```

```{r equation_numbering, echo = FALSE}
figure_nums <- captioner (prefix = "Figure")
figure_nums (name = "vegetables", caption = "Guilford Vegetable Data", display = FALSE)
```

Note: This is a working paper which will be expanded/updated frequently. All suggestions for improvement are welcome. The directory [gifi.stat.ucla.edu/dual](http://gifi.stat.ucla.edu/dual) has a pdf version, the complete Rmd file with all code chunks, the bib file, and the R source code.

#Introduction

In this note we give an algorithm for minimizing a quadratic $f_0$ over all $x$ for which $f_s(x)\leq 0$, where the $f_s$ with $s=1,\cdots,p$ are also quadratics. We assume the $f_s$ are convex quadratics for all $s=0,\cdots,p$. Thus we have $f_s(x)=c_s+b_s'x+\frac12 x'A_sx$, with $A_s$ positive semi-definite for all $s$. In addition we assume, without loss of generality, that $A_0$ is non-singular (and thus positive definite). Finally, we assume the Slater condition is satisfied, i.e. there is an $x$ such that $f_s(x)<0$ for $s=1,\cdots,p$.

Quadratic programming with quadratic constraints (QPQC) has been studied in great detail, both for the convex and the much more complicated non-convex case.
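To fix ideas, here is a minimal sketch of the objects just defined: it evaluates a convex quadratic $f_s(x)=c_s+b_s'x+\frac12 x'A_sx$ and checks that $x=0$ is a Slater point whenever $c_s<0$. The helper quad_eval and the toy data are our own illustration, not part of the dual.R code in the appendix.

```r
# Hypothetical helper (not in dual.R): evaluate f(x) = c + b'x + (1/2) x'Ax.
quad_eval <- function (x, a, b, c) {
  c + sum (b * x) + 0.5 * drop (crossprod (x, a %*% x))
}

set.seed (12345)
a1 <- crossprod (matrix (rnorm (9), 3, 3))  # positive semi-definite by construction
b1 <- rnorm (3)
c1 <- -1                                    # negative constant term
quad_eval (rep (0, 3), a1, b1, c1)          # equals c1 < 0, so x = 0 is a Slater point
```

Since the linear and quadratic terms vanish at $x=0$, the value there is just the constant $c_s$, which is why choosing $c_s<0$ guarantees strict feasibility.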
The preferred algorithm translates the problem into a second-order cone programming (SOCP) problem, which is then solved by interior point methods. This is the method used in the R packages DWD (@huang_haaland_lu_liu_marron_11), CLSOCP (@rudy_11), and cccp (@pfaff_15). We go a somewhat different route, more direct, but undoubtedly less efficient. In the appendix we collect some general expressions for the first and second derivatives of a marginal function and apply them to the Lagrangian of a constrained optimization problem. In the body of the paper we apply these formulas to the Lagrangian dual of the QPQC problem, using a straightforward version of the Newton method with non-negativity constraints.

#QPQC

Suppose the $f_s$ are convex quadratics for all $s=0,\cdots,p$. We have $f_s(x)=c_s+b_s'x+\frac12 x'A_sx$, with $A_s$ positive semi-definite for all $s$. In addition we assume, without loss of generality, that $A_0$ is non-singular (and thus positive definite). It follows that the Lagrangian is
$$g(x,y)=(c_0+\sum_{s=1}^py_sc_s)+(b_0+\sum_{s=1}^py_sb_s)'x+\frac12 x'\left\{A_0+\sum_{s=1}^py_sA_s\right\}x,$$
and thus
$$\label{E:qxy} x(y)=-\left\{A_0+\sum_{s=1}^py_sA_s\right\}^{-1}(b_0+\sum_{s=1}^py_sb_s),$$
and
$$\label{E:qhxy} h(y)=(c_0+\sum_{s=1}^py_sc_s)-\frac12(b_0+\sum_{s=1}^py_sb_s)'\left\{A_0+\sum_{s=1}^py_sA_s\right\}^{-1}(b_0+\sum_{s=1}^py_sb_s).$$
We could use the explicit formula $\eqref{E:qhxy}$ to compute derivatives of $h$, but instead we apply the results of the appendix.
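For concreteness, the closed-form expressions for $x(y)$ and $h(y)$ can be sketched directly in R. The function dual_value below and its storage conventions (an $n\times n\times(p+1)$ array with $A_0$ in the first slice, and analogous columns for $b$ and entries for $c$) are our own illustration, not the interface of the qpqc code in the appendix.

```r
# Hypothetical sketch (not the appendix code): given multipliers y >= 0,
# form A(y) = A_0 + sum_s y_s A_s, b(y) and c(y), then return the inner
# minimizer x(y) and the dual value h(y) from the closed-form expressions.
dual_value <- function (y, a, b, c) {
  ay <- a[, , 1]; by <- b[, 1]; cy <- c[1]
  for (s in seq_along (y)) {
    ay <- ay + y[s] * a[, , s + 1]
    by <- by + y[s] * b[, s + 1]
    cy <- cy + y[s] * c[s + 1]
  }
  xy <- -solve (ay, by)                        # x(y): minimizer of g(., y)
  hy <- cy - 0.5 * sum (by * solve (ay, by))   # h(y): dual function value
  list (x = xy, h = hy)
}
```

At $y=0$ this simply returns the unconstrained minimizer and minimum of $f_0$, and by construction $h(y)=g(x(y),y)$ for every $y\geq 0$.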
In the QPQC case we have, from $\eqref{E:ldh}$,
$$\mathcal{D}h(y)=F(x(y))=\begin{bmatrix}f_1(x(y))\\\vdots\\f_p(x(y))\end{bmatrix}.$$
For the second derivatives we need
$$\mathcal{D}_{11}g(x,y)=A_0+\sum_{s=1}^p y_sA_s,$$
and
$$\mathcal{D}F(x)=\begin{bmatrix}b_1'+x'A_1\\\vdots\\b_p'+x'A_p\end{bmatrix}.$$
Thus, from $\eqref{E:ld2h}$,
$$\{\mathcal{D}^2h(y)\}_{st}=-(A_sx(y)+b_s)'\left\{A_0+\sum_{u=1}^p y_uA_u\right\}^{-1}(A_tx(y)+b_t).$$
We now know how to compute first and second derivatives of the dual function. In step $k$ of the iterative process we use the quadratic Taylor approximation at the current value $y^{(k)}$ of $y$:
$$h(y)\approx h(y^{(k)})+(y-y^{(k)})'\mathcal{D}h(y^{(k)})+\frac12(y-y^{(k)})'\mathcal{D}^2h(y^{(k)})(y-y^{(k)}).$$
We then maximize this over $y\geq 0$ to find $\overline{y}^{(k)}$, using the pnnqp function from the lsei package (@wang_lawson_hanson_15). Finally we stabilize Newton's method by computing
$$y^{(k+1)}=y^{(k)}+\lambda^{(k)}(\overline{y}^{(k)}-y^{(k)}),$$
where
$$\lambda^{(k)}=\mathop{\mathbf{argmax}}_{0\leq\lambda\leq 1}\ h(y^{(k)}+\lambda(\overline{y}^{(k)}-y^{(k)})).$$
For this step-size or relaxation computation we use optimize from base R. But first we check if $(\overline{y}^{(k)}-y^{(k)})'\mathcal{D}h(\overline{y}^{(k)})\geq 0$, in which case we choose step size equal to one.

#Example

We give a small and artificial example. To make sure the Slater condition is satisfied we choose $c_s<0$ for $s=1,\cdots,p$, which guarantees that $f_s(0)<0$. The example has quadratics of three variables and five quadratic constraints.

```{r example_setup_1}
set.seed(54321)
a <- array (0, c(3, 3, 6))
for (j in 1:6) {
  a[, , j] <- crossprod (matrix (rnorm (300), 100, 3)) / 100
}
b <- matrix (rnorm (18), 3, 6)
c <- -abs (rnorm (6))
```

```{r computation_1}
print (qpqc (c(0, 0, 0, 0, 0), a, b, c, eps = 1e-10, verbose = TRUE))
```

#Appendix: Derivatives

In this appendix we collect some formulas for first and second derivatives of minima and minimizers in constrained problems.
It is just convenient to have them in a single place, and we can apply them in the body of the paper.

##Marginal Function

Suppose $g:\mathbb{R}^n\times\mathbb{R}^m\rightarrow\mathbb{R}$, and define
$$\label{E:h} h(y):=\min_x g(x,y)=g(x(y),y),$$
with
$$\label{E:xy} x(y):=\mathop{\mathbf{argmin}}_x g(x,y),$$
which we assume to be unique for each $y$. Now assume two times continuous differentiability of $g$. Then $x(y)$ satisfies
$$\mathcal{D}_1g(x(y),y)=0,$$
and thus
$$\mathcal{D}_{11}g(x(y),y)\mathcal{D}x(y)+\mathcal{D}_{12}g(x(y),y)=0.$$
It follows that
$$\label{E:dxy} \mathcal{D}x(y)=-\mathcal{D}_{11}^{-1}g(x(y),y)\mathcal{D}_{12}g(x(y),y).$$
Now
$$\label{E:dh} \mathcal{D}h(y)=\mathcal{D}_1g(x(y),y)\mathcal{D}x(y)+\mathcal{D}_2g(x(y),y)=\mathcal{D}_2g(x(y),y),$$
and
\begin{align}
\mathcal{D}^2h(y)&=\mathcal{D}_{21}g(x(y),y)\mathcal{D}x(y)+\mathcal{D}_{22}g(x(y),y)\notag\\
&=\mathcal{D}_{22}g(x(y),y)-\mathcal{D}_{21}g(x(y),y)\mathcal{D}_{11}^{-1}g(x(y),y)\mathcal{D}_{12}g(x(y),y).\label{E:d2h}
\end{align}

##Lagrangians

We specialize the results of the previous section to Lagrangian duals. Consider the problem of minimizing $f_0:\mathbb{R}^n\rightarrow\mathbb{R}$ over $x$ such that $f_s(x)\leq 0$ for $s=1,\cdots,p$, where $f_s:\mathbb{R}^n\rightarrow\mathbb{R}$. We can write the inequality constraints compactly as $F(x)\leq 0$, where $F:\mathbb{R}^n\rightarrow\mathbb{R}^p$. The Lagrangian is
$$\label{E:lh} g(x,y)=f_0(x)+y'F(x).$$
The primal problem is
$$\min_x\max_{y\geq 0}g(x,y)=\min_x\begin{cases}f_0(x)&\text{ if }F(x)\leq 0,\\+\infty&\text{ otherwise}\end{cases}=\min_x \{f_0(x)\mid F(x)\leq 0\}.$$
The dual problem is
$$\max_{y\geq 0}\min_x g(x,y)=\max_{y\geq 0} h(y).$$
For any feasible $x$ and $y$ we have weak duality $h(y)\leq f_0(x)$. If the $f_s$ for $s=0,\cdots,p$ are convex and the Slater condition is satisfied we have equal optimal values for the primal and dual problems.
Moreover if $y$ solves the dual problem then $x(y)$ solves the primal problem. Thus, from $\eqref{E:dh}$,
$$\label{E:ldh} \mathcal{D}h(y)=\mathcal{D}_2g(x(y),y)=F(x(y)),$$
and, from $\eqref{E:d2h}$,
$$\mathcal{D}^2h(y)=-\mathcal{D}_{21}g(x(y),y)\mathcal{D}_{11}^{-1}g(x(y),y)\mathcal{D}_{12}g(x(y),y).$$
Also
$$\mathcal{D}_{11}g(x(y),y)=\mathcal{D}^2f_0(x(y))+\sum_{s=1}^p y_s\mathcal{D}^2 f_s(x(y)),$$
and
$$\mathcal{D}_{21}g(x(y),y)=\mathcal{D}F(x(y)),$$
so that
$$\label{E:ld2h} \mathcal{D}^2h(y)=-\mathcal{D}F(x(y))\left\{\mathcal{D}^2f_0(x(y))+\sum_{s=1}^p y_s\mathcal{D}^2 f_s(x(y))\right\}^{-1}(\mathcal{D}F(x(y)))'.$$

#Appendix: Code

##dual.R

```{r file_dual, code = readLines("dual.R")}
```

##nnnewton.R

```{r file_nnnewton, code = readLines("nnnewton.R")}
```

##qpqc.R

```{r file_qpqc, code = readLines("qpqc.R")}
```

#References