There are only two $B$-splines of order 3 on these knots, and the basic interval is empty.

```{r lowthird, echo = FALSE, fig.align = "center"} y <- QbSpline (x, knots, k = 1) plot (x, y, type = "l", col = "RED", lwd = 3, ylim = c(0,1)) lines (x, QbSpline (x, knots, k = 2), col = "BLUE", lwd = 3) ```

The programs also work for multiple knots. Consider the example from @deboor_01, page 92. The knots are `r c(0.0, 1.0, 1.0, 3.0, 4.0, 6.0, 6.0, 6.0)`, and the order is three. The basic interval is $[1,6]$.

```{r lowthirdmult, echo = FALSE, fig.align = "center"} knots<-c(0.0, 1.0, 1.0, 3.0, 4.0, 6.0, 6.0, 6.0) x<- seq(0,6,length=1000) y <- QbSpline (x, knots, k = 1) plot (x, y, type = "l", col = "RED", lwd = 3, ylim = c(0,1)) lines (x, QbSpline (x, knots, k = 2), col = "BLUE", lwd = 3) lines (x, QbSpline (x, knots, k = 3), col = "GREEN", lwd = 3) lines (x, QbSpline (x, knots, k = 4), col = "ORANGE", lwd = 3) lines (x, QbSpline (x, knots, k = 5), col = "BROWN", lwd = 3) ```

In @deboor_01, p 89, we find another example in which there are only two distinct knots, an infinite sequence of zeroes, followed by an infinite sequence of ones. In this case there are only $m$ different $B$-splines, restriction to $[0,1]$ of the familiar polynomials $$ B_{j,m-1}(x)=\binom{m-1}{j}x^{m-1-j}(1-x)^{j} $$ for $j=0,\cdots,m-1$.

```{r lowthirdmultb, echo = FALSE, fig.align = "center"} knots<-c(rep(0,10),rep(1,10)) x<- seq(0,1,length=1000) y <- QbSpline (x, knots, k = 1) plot (x, y, type = "l", col = "RED", lwd = 2, ylim = c(0,1)) for (i in 2:17) { lines (x, QbSpline (x, knots, k = i), col = "RED", lwd = 2) } ```

##Using GSL The GNU Scientific Library (@gsl_manual_16) has $B$-spline code. The function `gslSpline()` in R calls the compiled `gslSpline.c`, which is linked with the relevant code from `libgsl.dylib`. We use the Ramsay example again, this time with $m=3$ copies of the lower and upper bounds of the interval, which makes the basic interval $[0,1]$. ```{r with_gsl_comp, fig.align = "center"} knots <- c(0,.3,.5,.6,1) order <- 3 x<-seq(0,1,length = 1001) h <- matrix (0, 1001, 6) for (i in 1:1001) h[i,] <- gslSpline (x[i], order, knots) ```

```{r with_gsl_plot, fig.align = "center", echo = FALSE} plot(x, h[,1], type = "l", col = "RED", lwd = 3, ylab = "B-spline") for (j in 2:ncol(h)) lines (x, h[,j], col = "RED", lwd = 3) ```

##Using Recursion We have previously published spline function code, using an R interface to C code, in @deleeuw_E_15d. That code translated the Fortran code in an unpublished note by @sinha to C. There are some limitations associated with this implementation. First, it is limited to extended partitions with simple inner knots. Second, the function to compute $B$-spline values recursively calls itself, using the basic relation $\eqref{E:Nspline}$, which is computationally not necessarily the best way to go. ```{r with_sinha_comp, fig.align = "center"} innerknots <- c( .3, .5, .6) degree <- 2 lowknot <- 0 highknot <- 1 x<-seq(0,1,length = 1001) h <- sinhaBasis (x, degree, innerknots, lowknot, highknot) ```

```{r with_sinha_plot, fig.align = "center", echo = FALSE} plot(x, h[,1], type = "l", col = "RED", lwd = 3, ylab = "B-spline") for (j in 2:ncol(h)) lines (x, h[,j], col = "RED", lwd = 3) ```

##De Boor In this paper we implement the basic BSPLVB algorithm from @deboor_01, page 111, for normalized $B$-splines. There are two auxilary routines, one to create the extended partition, and one that uses bisection to locate the knot interval in which a particular value is located (@schumaker_07, p 191). The function `bsplineBasis()` takes an arbitrary knot sequence. It can be combined with `extendedPartition()`, which uses inner knots and boundary points to create the extended partion. ##Illustrations For our example, which is the same as the one from figure 1 in @ramsay_88, we choose $a=0$, $b=1$, with simple interior knots `r c(.3,.5,.6)`. First the step functions, which have order 1.

```{r order1mult1, fig.align = "center"} innerknots <- c(.3,.5,.6) multiplicities <- c(1,1,1) order <- 1 lowend <- 0 highend <- 1 x <- seq (1e-6, 1-1e-6, length = 1000) knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) k <- ncol (h) par (mfrow=c(2,2)) for (j in 1:k) { ylab <- paste("B-spline", formatC(j, digits = 1, width = 2, format = "d")) plot (x, h[, j], type="l", col = "RED", lwd = 3, ylab = ylab, ylim = c(0,1)) } ```

Now the hat functions, which have order 2, again with simple knots.

```{r order2mult1,fig.align = "center"} multiplicities <- c(1,1,1) order <- 2 knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) k <- ncol (h) par (mfrow=c(2,3)) for (j in 1:k) { ylab <- paste("B-spline", formatC(j, digits = 1, width = 2, format = "d")) plot (x, h[, j], type="l", col = "RED", lwd = 3, ylab = ylab, ylim = c(0,1)) } ```

Next piecewise quadratics, with simple knots, which implies continuous differentiability at the knots. This are the N-splines corresponding with the M-splines in figure 1 of @ramsay_88.

```{r order3mult1, fig.align = "center"} multiplicities <- c(1,1,1) order <- 3 knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) k <- ncol (h) par (mfrow=c(2,3)) for (j in 1:k) { ylab <- paste("B-spline", formatC(j, digits = 1, width = 2, format = "d")) plot (x, h[, j], type="l", col = "RED", lwd = 3, ylab = ylab, ylim = c(0,1)) } ```

If we change the multiplicities to `r c(1,2,3)`, then we lose some of the smoothness.

```{r order3mult2, fig.align = "center"} multiplicities <- c(1,2,3) order <- 3 knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) k <- ncol (h) par (mfrow=c(3,3)) for (j in 1:k) { ylab <- paste("B-spline", formatC(j, digits = 1, width = 2, format = "d")) plot (x, h[, j], type="l", col = "RED", lwd = 3, ylab = ylab, ylim = c(0,1)) } ```

#Monotone Splines ##I-splines There are several ways to restrict splines to be monotone increasing. Since $B$-splines are non-negative, the definite integral of a $B$-spline of order $m$ from the beginning of the interval to a value $x$ in the interval is an increasing spline of order $m+1$. Integrated $B$-splines are known as *I-splines* (@ramsay_88). Non-negative linear combinations $I$-splines can be used as a basis for the convex cone of increasing splines. Note, however, that if we use an extended partition, then all $I$-splines start at value zero and end at value one, which means their convex combinations are the splines that are also probability distributions on the interval. To get a basis for the increasing splines we need to add the constant function to the $I$-splines and allow it to enter the linear combination with either sign. ###Low Order I-splines Straightforward integration, and using $\eqref{E:NM}$, gives some explicit formulas. If we integrate the step functions we get the piecewise linear $I$-splines. $$ \int_{-\infty}^x M_{j,1}(t)dt=\begin{cases} 0&\text{ if }x\leq t_j,\\ \frac{x-t_j}{t_{j+1}-t_j}&\text{ if }t_j\leq x\leq t_{j+1},\\ 1&\text{ if }x\geq t_{j+1}. \end{cases} $$ And if we integrate the piecewise linear $B$-splines of order 1 we get piecewise quadratic $I$-splines. $$ \int_{-\infty}^x M_{j,2}(t)dt=\begin{cases} 0&\text{ if }x\leq t_j,\\ \frac{(x-t_j)^2}{(t_{j+1}-t_j)(t_{j+2}- t_j)}&\text{ if }t_j\leq x\leq t_{j+1},\\ \frac{x-t_j}{t_{j+2}-t_j}+\frac{(x-t_{j+1})(t_{j+2}-x)}{(t_{j+2}-t_j)(t_{j+2}-t_{j+1})} &\text{ if }t_{j+1}\leq x\leq t_{j+2},\\ 1&\text{ if }x\geq t_{j+2}. \end{cases} $$ Both sets of $I$-splines are plotted in the next two figures, using R functions from `lowSpline.R`.

```{r ilowfirst, echo = FALSE, fig.align = "center"} knots <- c(0.0, 0.3, 0.5, 0.6, 1.0) x<-seq(0,1,length=1000) y <- IZbSpline (x, knots, k = 1) plot (x, y, type = "l", col = "RED", lwd = 3, ylim = c(0,1)) lines (x, IZbSpline (x, knots, k = 2), col = "BLUE", lwd = 3) lines (x, IZbSpline (x, knots, k = 3), col = "GREEN", lwd = 3) lines (x, IZbSpline (x, knots, k = 4), col = "ORANGE", lwd = 3) ```

```{r ilowsecond, echo = FALSE, fig.align = "center"} knots <- c(0.0, 0.3, 0.5, 0.6, 1.0) x<-seq(0,1,length=1000) y <- ILbSpline (x, knots, k = 1) plot (x, y, type = "l", col = "RED", lwd = 3, ylim = c(0,1)) lines (x, ILbSpline (x, knots, k = 2), col = "BLUE", lwd = 3) lines (x, ILbSpline (x, knots, k = 3), col = "GREEN", lwd = 3) ```

###General Case Integrals of I-splines are most economically computed by using the formula first given by @gaffney_76. If $\ell$ is defined by $t_{j+\ell-1}\leq x

Now we integrate the hat functions, which have order 2, again with simple knots, to find piecewise quadratic I-splines of order 3. These are the functions in the example of @ramsay_88.

```{r Iorder2mult1, echo = FALSE, fig.align = "center"} multiplicities <- c(1,1,1) order <- 3 knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) h<-rowSums(h)-t(apply(h, 1, cumsum)) k <- ncol (h) par (mfrow=c(2,3)) for (j in 1:(k-1)) { ylab <- paste("$B$-spline", formatC(j, digits = 1, width = 2, format = "d")) plot (x, h[, j], type="l", col = "RED", lwd = 3, ylab = ylab) } ```

Finally, we change the multiplicities to `r c(1,2,3)`, and compute the corresponding piecewise quadratic I-splines.

```{r Iorder3mult2, echo = FALSE, fig.align = "center"} multiplicities <- c(1,2,3) order <- 3 knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) h<-rowSums(h)-t(apply(h, 1, cumsum)) k <- ncol (h) par (mfrow=c(3,3)) for (j in 1:(k-1)) { ylab <- paste("$B$-spline", formatC(j, digits = 1, width = 2, format = "d")) plot (x, h[, j], type="l", col = "RED", lwd = 3, ylab = ylab) } ```

#Time Series Example Our first example smoothes a time series by fitting a spline. We use the number of births in New York from 1946 to 1959 (on an unknown scale), from Rob Hyndman's time series archive at http://robjhyndman.com/tsdldata/data/nybirths.dat. ```{r births_data, echo = FALSE} births <- scan ("http://robjhyndman.com/tsdldata/data/nybirths.dat") ``` ##B-splines First we fit $B$-splines of order 3. The basis matrix uses $x$ equal to $1:168$, with inner knots `r 12 * 1:13`, and interval $[1,168]$. ```{r births_basis} innerknots <- 12 * 1:13 multiplicities <- rep(1,13) lowend <- 1 highend <- 168 order <- 3 x <- 1:168 knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) u <- lm.fit(h, births) res <- sum ((births - h%*%u$coefficients)^2) ```

```{r birth-bspline-plot, fig.align = "center", echo = FALSE} birthsts <- ts (cbind (births, h%*%u$coefficients), frequency = 12, start = c(1946, 1)) plot (birthsts, lwd = 2, plot.type = "single", col = c("RED", "BLUE")) ```

The residual sum of squares is `r res`. ##I-splines We now fit the $I$-spline using the $B$-spline basis. Compute $Z=HS$ using `cumsum()`, and then $\overline y$ and $\overline Z$ by centering (substracting the column means). The formula is $$ \min_{\alpha\geq 0,\gamma}\mathbf{SSQ}\ (y-Z\alpha-\gamma e_n)=\min_{\alpha\geq 0}\mathbf{SSQ}\ (\overline y-\overline Z\alpha). $$ We use `pnnls()` from @wang_lawson_hanson_15. ```{r births_isplines_comp} knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) g <- rowSums(h) - t(apply (h, 1, cumsum)) g <- cbind (1, g) u <- pnnls (g, births, 1)$x v <- g%*%u ```

```{r births_isplines_plot, fig.align = "center", echo = FALSE} birthsts <- ts (cbind (births, v), frequency = 12, start = c(1946, 1)) plot (birthsts, lwd = 2, plot.type = "single", col = c("RED", "BLUE")) ```

The residual sum of squares is `r sum ((births - v)^2)`. ##B-Splines with monotone weights Just to make sure, we also solve the problem $$ \min_{\beta_1\leq\beta_2\leq\cdots\leq\beta_p}\mathbf{SSQ}(y-X\beta), $$ which should give the same solution, and the same loss function value, because it is just another way to fit $I$-splines. We use the `lsi()` function from @wang_lawson_hanson_15. ```{r births_isplines_values} knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) nb <- ncol (h) d <- matrix(0,nb-1,nb) diag(d)=-1 d[outer(1:(nb-1),1:nb,function(i,j) (j - i) == 1)]<-1 u<-lsi(h,births,e=d,f=rep(0,nb-1)) v <- h %*% u ``` ```{r births_isplines_values_plot, fig.align = "center", echo = FALSE} birthsts <- ts (cbind (births, v), frequency = 12, start = c(1946, 1)) plot (birthsts, lwd = 2, plot.type = "single", col = c("RED", "BLUE")) ```

The residual sum of squares is `r sum ((births - v)^2)`, indeed the same as before. ##B-Splines with monotone values Finally we solve $$ \min_{x_1'\beta\leq\cdots\leq x_n'\beta} \mathbf{SSQ}\ (y-X\beta) $$ using `mregnnM()` from @deleeuw_E_15d, which solves the dual problem using `nnls()` from @wang_lawson_hanson_15. ```{r births_bsplines_monotonic_comp} knots <- extendPartition (innerknots, multiplicities, order, lowend, highend)$knots h <- bsplineBasis (x, knots, order) u <- mregnnM(h, births) ```

```{r births_bsplines_monotonic_plot,fig.align = "center", echo = FALSE} birthsts <- ts (cbind (births, u$xb), frequency = 12, start = c(1946, 1)) plot (birthsts, lwd = 2, plot.type = "single", col = c("RED", "BLUE")) ```

The residual sum of squares is `r u$f`, which is indeed smaller than the $I$-splines value, although only very slightly so. #Regression Example We also analyze a regression example, using the Neumann data that have been analyzed previously in @gifi_B_90, pages 370-376, and in @deleeuw_mair_A_09a,pages 16-17. The predictors are temperature and pressure, the outcome variable is density. We use a step function monotone spline for temperature and a piecewise quadratic monotone spline for pressure. The lest squares problem is to fit $$ \mathbf{SSQ}(y-(\gamma e_n+H_1\alpha_1+H_2\alpha_2)), $$ where $H_1$ and $H_2$ are $I$-spline bases and $\alpha_1$ and $\alpha_2$ are non-negative. We do not transform the outcome variable density, to keep things relatively simple. ```{r regression_example} data(neumann, package = "homals") #lm.fit (cbind(1,neumann[,1], neumann[,2]), neumann[,3]) knots1 <- c(0, 20, 40, 60, 80, 100, 120, 140) order1 <- 1 knots2 <- c(0,0,0,100,200,300,400,500,600,600,600) order2 <- 3 h1 <- isplineBasis(200-neumann[,1], knots1, order1) h2 <- isplineBasis(neumann[,2], knots2, order2) g <- cbind (1, h1, h2) u <- pnnls (g, neumann[,3], 1)$x u1 <- u[1+1:ncol(h1)] u2 <- u[1+ncol(h1)+1:ncol(h2)] ``` The next two plots give the transformations of the predictors.

```{r neumann_plot1, fig.align = "center", echo = FALSE} plot (neumann[,1], h1%*%u1, type = "l", col = "RED", lwd = 3, xlab = "temperature", ylab = "I-spline") ```

```{r neumann_plot2, fig.align = "center", echo = FALSE} o <- order(neumann[,2]) plot (neumann[o,2],(h2%*%u2)[o], type = "l", col = "RED", lwd = 3, xlab = "pressure", ylab = "I-spline") ```

And finally, we give the residual plot, observed density versus predicted density.

```{r neumann_plot3, fig.align = "center", echo = FALSE} plot (neumann[,3], g%*%u, col = "BLUE", xlab = "observed", ylab = "predicted") ```

#Appendix: Code ##R code ###lowSpline.R ```{r file_auxilary, code = readLines("lowSpline.R")} ``` ###gslSpline.R ```{r file_auxilary, code = readLines("gslSpline.R")} ``` ###sinhaSpline.R ```{r file_auxilary, code = readLines("sinhaSpline.R")} ``` ###deboor.R ```{r file_auxilary, code = readLines("deboor.R")} ``` ##C code ###gslSpline.c ```{c file_auxilary, code = readLines("gslSpline.c"), engine='c', results='hide'} ``` ###sinhaSpline.c ```{c file_auxilary, code = readLines("sinhaSpline.c"), engine='c', results='hide'} ``` ###deboor.c ```{c file_auxilary, code = readLines("deboor.c"), engine='c', results='hide'} ``` #References