---
title: "Exceedingly Simple Monotone Regression (with Ties)"
author: "Jan de Leeuw"
date: "Version 01, April 01, 2017"
output:
pdf_document:
keep_tex: yes
number_sections: yes
toc: yes
toc_depth: 3
html_document:
keep_md: yes
number_sections: yes
toc: yes
fontsize: 12pt
graphics: yes
bibliography: jbkTies.bib
abstract: A C implementation of Kruskal's up-and-down-blocks monotone regression algorithm
for use with .C() is extended to include the three classic ways of handling ties.
It is then compared with other implementations.
---
```{r function_code, echo = FALSE}
source("jbkTies.R")
mprint <- function (x,
d = 6,
w = 8,
o = "f",
f = "") {
print (noquote (formatC (
x,
di = d,
wi = w,
fo = o,
flag = f
)))
}
```
```{r packages, echo = FALSE}
options (digits = 10)
```
Note: This is a working paper which will be expanded/updated frequently. All suggestions for improvement are welcome. The directory [gifi.stat.ucla.edu/jbkTies](http://gifi.stat.ucla.edu/jbkTies) has a pdf version, the bib file, the complete Rmd file with the code chunks, and the R source code.
#Introduction
In a recent Rpub (@deleeuw_E_17g) we presented a `.C()` version of Kruskal's up-and-down-blocks algorithm for monotone (or isotone) regression. In @deleeuw_E_16d there is R
code that extends monotone regression by adding the three classical ways of handling ties. See @kruskal_64b for the primary and secondary approach, and @deleeuw_A_77 for the tertiary approach, as well as for proofs that all three methods are indeed optimal. The R code for tie handling in @deleeuw_E_16d uses lists for blocks of ties, and includes a number of loops over blocks. Thus there is some room for improvement by implementing tie-handling using `.C()`.
The appendix of this follow-up paper presents `.C()` code for monotone regression with the three approaches. We have chosen to make it modular, in the sense that the basic
operations all have a separate routine written in C, with correspondng R wrapper. All C routines are written using `.C()` conventions, i.e. they pass by reference and
return a void. Internally they do not use any R-based functions and include files, so they can be easily used in other contexts as well. Thus extra storage need in the
routines is allocated and freed internally, without involving R.
#Small Example
```{r example}
x <- c(2.1, 2.1, 3.5, 1.9, 3.5, 3.5, 1.9, 2.1, 1.9)
y <- c(2, 1, 6, 5, 4, 7, 8, 9, 3)
w <- c(1, 1, 2, 2, 2, 2, 2, 1, 1)
```
##Sorting
We use the function `mySortDouble()`, a `.C()` interface to a C routine of the same name, to sort x and to put y and w in the correct order. In R parlance, the routine returns
`order(x)` as well as `sort (x) = x[order (x)]` , `y[order (x)]` and `w[order (x)]`. The sorting routine is C code taken from @deleeuw_E_17f. It uses the system `qsort()`
routine, which implements quicksort. Quicksort is not a stable sorting routine, in the sense that it does not guarantee that tieblocks will be sorted in a unique way. This, however, is not a problem in the monotone regression context, because that form of instability will not change the outcome of the regression routine.
```{r sort_double, echo = FALSE}
h <- mySortDouble (x, y, w)
mprint(h$x, d = 1, w = 3, o = "f")
mprint(h$y, d = 1, w = 3, o = "d")
mprint(h$w, d = 1, w = 3, o = "d")
mprint(h$xind, d = 1, w = 3, o = "d")
```
The function `tieBlock()`, again a `.C()` interface to a C routine of the same name, uses the sorted x to return the tie-blocks (a vector of the same length as x, using integers to indicate block membership).
```{r tieblock, echo = FALSE}
g <- tieBlock (h$x)
mprint(g$iblks, d = 1, w = 3, o = "d")
```
We emphasize that in a non-metric multidimensional scaling context (or, more generally, an alternating least squares context) the sorting only has to be done only once, not in every iteration. Thus in repeated calls of the algorithm this first part can be skipped.
##Primary Approach
The primary approach to ties starts with sorting y within blocks. This is done with `sortBlocks()`, again a `.C()` interface. Along with y we sort w and the order vector.
###Sort within Blocks
```{r sortblocks, echo = FALSE}
f <- sortBlocks (h$y, h$w, h$xind, g$iblks)
mprint(f$y, d = 1, w = 3, o = "d")
mprint(f$w, d = 1, w = 3, o = "d")
mprint(f$xind, d = 1, w = 3, o = "d")
```
The newly sorted y and w are entered into `jbkPava()`, the monotone regression routine, again a `.C()` interface to C code.
###Monotone Regression
```{r jbkpava, echo = FALSE}
e <- jbkPava (f$y, f$w)
mprint(e$x, d = 3, w = 5, o = "f")
```
###Restore Order
Finally `mySortInteger()` ,another `.C()` interface, is used to find the inverse permutation that puts the monotone regression output back in the correct order.
```{r mysortinteger, echo = FALSE}
c <- mySortInteger (f$xind)
mprint(c$xind, d = 1, w = 3, o = "d")
mprint(e$x[c$xind], d = 3, w = 5, o = "f")
```
###Residual Sum of Squares
Equal to `r sum((y - e$x[c$xind]) ^ 2)`.
##Secondary Approach
###Make Blocks
In the secondary approach we use `makeBlocks()`, another `.C()` interface, to create block averages and block weights.
```{r makeblocks, echo = FALSE}
b <- makeBlocks (h$y, h$w, g$iblks)
mprint (b$xblks, d = 3, w = 5, o = "f")
mprint (b$wblks, d = 3, w = 5, o = "f")
```
###Monotone Regression
Then we perform monotone regression on the block values using block weights.
```{r monotwo, echo = FALSE}
a <-jbkPava (b$xblks, b$wblks)
mprint(a$x, d = 3, w = 5, o = "f")
```
###Expand and Restore Order
Finally we use `mySortInteger()`, another `.C()` sorting routine, to expand the monotone regression values in the correct order to a vector of `length (x)`.
```{r mysorttwo,echo = FALSE}
i <- mySortInteger(h$xind)
mprint ((a$x[g$iblks])[i$xind], d = 3, w = 5, o = "f")
```
###Residual Sum of Squares
Equal to `r sum((y - (a$x[g$iblks])[i$xind]) ^ 2)`.
##Tertiary Approach
###Adjust Means
The tertiary approach forms blocks and does a monotone regression as in the secondary approach. It then adjusts the block means optimally.
```{r tertiary, echo = FALSE}
u <- h$y - (b$xblks - a$x)[g$iblks]
mprint (u, d = 3, w = 5, o = "f")
```
###Restore Order
```{r mysortthree, echo = FALSE}
mprint(u[i$xind], d = 3, w = 5, o = "f")
```
###Residual Sum of Squares
Equal to `r sum((y - u[i$xind]) ^ 2)`.
#Timings
We compare two different implementations of monotone regression with ties. The first is `monregR()`, which does the data-handling and tie-handling in R, using lists. It is basically the code from @deleeuw_E_16d, with two changes. In the first place the
Fortran code for monotone regression from @cran_80 is replaced by the C code from @deleeuw_E_17g. Secondly, the R code in @deleeuw_E_16d does not work correctly in the case of unequal weights, and we have corrected that in the current version of `monregR()`.
The second function is `monregRC()`, which does data-handling, tie-handling, and monotone regression in C, using the functions we have illustrated earlier in our small example.
In our comparion we use
```{r data, eval = FALSE}
x <- sample (1:blks, n, replace = TRUE)
y <- rnorm (n)
```
where n = 10,000. Since we sample with equal probabilities from `1:blks` the vector x will have (at most) blks different values, so if blks is small there will be many ties,
if blks is large there will be only a few ties. We let blks take the values `c(2,10,100,1000,10000)`. Combined with the three options for tie-handling this gives 15
different analyses, and each one is repeated 100 times. We report the elapsed time from `system.time()`.
```{r runner_txt, echo = FALSE, cache = TRUE}
f <- function (n, m, blks, ties) {
for (j in 1:m) {
x <- sample (1:blks, n, replace = TRUE)
y <- rnorm (n)
h <- monregR (x, y, ties = ties)
}
}
g <- function (n, m, blks, ties) {
for (j in 1:m) {
x <- sample (1:blks, n, replace = TRUE)
y <- rnorm (n)
h <- monregRC (x, y, ties = ties)
}
}
n <- 10000
m <- 100
ties <- c(1, 2, 3)
blks <- c(2, 10, 100, 1000, 10000)
x <- matrix (0, 5, 3)
row.names(x)<-as.character(blks)
colnames(x)<-as.character(ties)
y <- matrix (0, 5, 3)
row.names(y)<-as.character(blks)
colnames(y)<-as.character(ties)
set.seed(12345)
for (i in 1:5) {
for (j in 1:3) {
x[i,j] <- system.time(f(n, m, blks[i], ties[j]))[3]
y[i,j] <- system.time(g(n, m, blks[i], ties[j]))[3]
}
}
```
Results for `monregR()` are
```{r result_x, echo = FALSE}
mprint(x, w = 6, d = 4, o = "f")
```
Results for `monregRC()` are
```{r result_y, echo = FALSE}
mprint(y, w = 6, d = 4, o = "f")
```
Choosing n = 10,000 is similar to what we used in @deleeuw_E_17g, but in a typical multidimensional scaling context n = 500 is perhaps more realistic.
```{r realistic, echo = FALSE, cache = TRUE}
n <- 500
blks <- c(2, 10, 100, 250, 500)
row.names(x)<-as.character(blks)
row.names(y)<-as.character(blks)
set.seed(12345)
for (i in 1:5) {
for (j in 1:3) {
x[i,j] <- system.time(f(n, m, blks[i], ties[j]))[3]
y[i,j] <- system.time(g(n, m, blks[i], ties[j]))[3]
}
}
```
Now the results for `monregR()` are
```{r result_x_real, echo = FALSE}
mprint(x, w = 6, d = 4, o = "f")
```
and those for `monregRC()` are
```{r result_y_real, echo = FALSE}
mprint(y, w = 6, d = 4, o = "f")
```
#Conclusion
Running time for `monregR()` heavily depends on the number of blocks, because the R code has loops over blocks. For a small number of blocks `monregR()` is even faster than `monregRC()`. If n= 10,000 then up to five times as fast for an expected block size of 5000 (two blocks). But for more blocks `monregR()` rapidly loses its advantage. For
n = 500 we find that `monregRC()` is almost always faster, and usually much faster.
#Code
##R code
```{r file_auxilary, code = readLines("jbkTies.R")}
```
##C code
```{c c_code, engine='c', results='hide'}
#include
#include
#include
struct block {
double value;
double weight;
int size;
int previous;
int next;
};
struct quadruple {
double value;
double result;
double weight;
int index;
};
struct triple {
double value;
double weight;
int index;
};
struct pair {
int value;
int index;
};
int myCompDouble (const void *, const void *);
int myCompInteger (const void *, const void *);
void mySortDouble (double *, double *, double *, int *, const int *);
void mySortInteger (int *, int *, const int *);
void mySortInBlock (double *, double *, int *, int *);
void tieBlock (double *, int *, const int *, int *);
void makeBlocks (double *, double *, double *, double *, int *, const int *, const int *);
void sortBlocks (double *, double *, int *, const int *, const int *, const int *);
void jbkPava (double *, double *, const int *);
int myCompDouble (const void *px, const void *py) {
double x = ((struct quadruple *)px)->value;
double y = ((struct quadruple *)py)->value;
return (int)copysign(1.0, x - y);
}
int myCompInteger (const void *px, const void *py) {
int x = ((struct pair *)px)->value;
int y = ((struct pair *)py)->value;
return (int)copysign(1.0, x - y);
}
void mySortInBlock (double *x, double *w, int *xind, int *n) {
int nn = *n;
struct triple *xi =
(struct triple *) calloc((size_t) nn, (size_t) sizeof(struct triple));
for (int i = 0; i < nn; i++) {
xi[i].value = x[i];
xi[i].weight = w[i];
xi[i].index = xind[i];
}
(void) qsort(xi, (size_t)nn, (size_t)sizeof(struct triple), myCompDouble);
for (int i = 0; i < nn; i++) {
x[i] = xi[i].value;
w[i] = xi[i].weight;
xind[i] = xi[i].index;
}
free(xi);
return;
}
void mySortDouble (double *x, double *y, double *w, int *xind, const int *n) {
int nn = *n;
struct quadruple *xi =
(struct quadruple *) calloc((size_t) nn, (size_t) sizeof(struct quadruple));
for (int i = 0; i < nn; i++) {
xi[i].value = x[i];
xi[i].result = y[i];
xi[i].weight = w[i];
xi[i].index = i + 1;
}
(void) qsort(xi, (size_t)nn, (size_t)sizeof(struct quadruple), myCompDouble);
for (int i = 0; i < nn; i++) {
x[i] = xi[i].value;
y[i] = xi[i].result;
w[i] = xi[i].weight;
xind[i] = xi[i].index;
}
free(xi);
return;
}
void mySortInteger (int *x, int *k, const int *n) {
int nn = *n;
struct pair *xi =
(struct pair *) calloc((size_t) nn, (size_t) sizeof(struct pair));
for (int i = 0; i < nn; i++) {
xi[i].value = x[i];
xi[i].index = i + 1;
}
(void) qsort(xi, (size_t)nn, (size_t)sizeof(struct pair), myCompInteger);
for (int i = 0; i < nn; i++) {
x[i] = xi[i].value;
k[i] = xi[i].index;
}
free(xi);
return;
}
void tieBlock (double *x, int *iblks, const int *n, int *nblk) {
iblks[0] = 1;
for (int i = 1; i < *n; i++) {
if (x[i - 1] == x[i]) {
iblks[i] = iblks[i - 1];
} else {
iblks[i] = iblks[i - 1] + 1;
}
}
*nblk = iblks[*n - 1];
return;
}
void makeBlocks (double *x, double *w, double *xblks, double *wblks, int *iblks, const int *n, const int *nblk) {
for (int i = 0; i < *nblk; i++) {
xblks[i] = 0.0;
wblks[i] = 0.0;
}
for (int i = 0; i < *n; i++) {
xblks [iblks [i] - 1] += w[i] * x[i];
wblks [iblks [i] - 1] += w[i];
}
for (int i = 0; i < *nblk; i++) {
xblks[i] = xblks[i] / wblks[i];
}
return;
}
void sortBlocks (double *y, double *w, int *xind, const int *iblks, const int *n, const int *nblk) {
int *nblks = (int *) calloc((size_t) * nblk, sizeof(int));
for (int i = 0; i < *n; i++) {
nblks[iblks[i] - 1]++;
}
int k = 0;
for (int i = 0; i < *nblk; i++) {
int nn = nblks[i];
(void) mySortInBlock (y + k, w + k, xind + k, &nn);
k += nn;
}
free (nblks);
return;
}
void jbkPava (double *x, double *w, const int *n) {
struct block *blocks = calloc ((size_t) * n, sizeof(struct block));
for (int i = 0; i < *n; i++) {
blocks[i].value = x[i];
blocks[i].weight = w[i];
blocks[i].size = 1;
blocks[i].previous = i - 1; // index first element previous block
blocks[i].next = i + 1; // index first element next block
}
int active = 0;
do {
bool upsatisfied = false;
int next = blocks[active].next;
if (next == *n) upsatisfied = true;
else if (blocks[next].value > blocks[active].value) upsatisfied = true;
if (!upsatisfied) {
double ww = blocks[active].weight + blocks[next].weight;
int nextnext = blocks[next].next;
blocks[active].value = (blocks[active].weight * blocks[active].value + blocks[next].weight * blocks[next].value) / ww;
blocks[active].weight = ww;
blocks[active].size += blocks[next].size;
blocks[active].next = nextnext;
if (nextnext < *n)
blocks[nextnext].previous = active;
blocks[next].size = 0;
}
bool downsatisfied = false;
int previous = blocks[active].previous;
if (previous == -1) downsatisfied = true;
else if (blocks[previous].value < blocks[active].value) downsatisfied = true;
if (!downsatisfied) {
double ww = blocks[active].weight + blocks[previous].weight;
int previousprevious = blocks[previous].previous;
blocks[active].value = (blocks[active].weight * blocks[active].value + blocks[previous].weight * blocks[previous].value) / ww;
blocks[active].weight = ww;
blocks[active].size += blocks[previous].size;
blocks[active].previous = previousprevious;
if (previousprevious > -1)
blocks[previousprevious].next = active;
blocks[previous].size = 0;
}
if ((blocks[active].next == *n) && downsatisfied) break;
if (upsatisfied && downsatisfied) active = next;
} while (true);
int k = 0;
for (int i = 0; i < *n; i++) {
int blksize = blocks[i].size;
if (blksize > 0.0) {
for (int j = 0; j < blksize; j++) {
x[k] = blocks[i].value;
k++;
}
}
}
free (blocks);
}
```
#References