
These functions facilitate storing just the non-duplicated elements of a distance matrix, as a vector, in an hdf5 file. Because a distance matrix is symmetric and has zeros on the diagonal, all of its distances can be stored in a vector that contains fewer than half the values in the full matrix.

Usage

shorten_distance_matrix(dm)

expand_distance_matrix(vals)

Arguments

dm

A distance matrix: a symmetric n x n matrix containing all the pairwise distances among n locations.

vals

The values from the lower triangle of a distance matrix in column-major order, as returned by shorten_distance_matrix().

Value

shorten_distance_matrix() returns a vector of values. expand_distance_matrix() returns the full matrix given that vector.
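
As a rough illustration of the sizes involved (not code from the package), an n x n distance matrix has n(n - 1)/2 values below the diagonal, so n can be recovered from the length of the stored vector. The helper names `n_stored()` and `n_from_length()` are hypothetical and used only for this sketch.

```r
# Hypothetical helpers, for illustration only.

# Number of values stored for an n x n distance matrix
# (strict lower triangle, diagonal excluded).
n_stored <- function(n) n * (n - 1) / 2

# Recover n from the stored vector length k by solving k = n * (n - 1) / 2.
n_from_length <- function(k) {
  n <- (1 + sqrt(1 + 8 * k)) / 2
  stopifnot(isTRUE(all.equal(n, round(n))))  # k must be a triangular number
  as.integer(round(n))
}

n_stored(5)        # 10 values instead of 25
n_from_length(10)  # 5
```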

Details

shorten_distance_matrix() extracts the lower triangle in column major order from a distance matrix.

expand_distance_matrix() reassembles the full distance matrix from that vector.

The purpose is to roughly halve the number of values stored in the hdf5 file.
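
The extract-and-rebuild round trip described here can be sketched in base R. This is a minimal sketch under the assumption that the input is a symmetric matrix with a zero diagonal; `shorten_sketch()` and `expand_sketch()` are hypothetical stand-ins, and the package's own implementation may differ.

```r
# Hypothetical stand-ins; the package's own implementation may differ.

# Extract the strict lower triangle. R stores matrices in column-major
# order, so logical indexing returns the values column by column.
shorten_sketch <- function(dm) {
  dm[lower.tri(dm)]
}

# Rebuild the full symmetric matrix with a zero diagonal.
expand_sketch <- function(vals) {
  # Solve length(vals) = n * (n - 1) / 2 for n
  n <- as.integer(round((1 + sqrt(1 + 8 * length(vals))) / 2))
  stopifnot(n * (n - 1) / 2 == length(vals))
  m <- matrix(0, nrow = n, ncol = n)
  m[lower.tri(m)] <- vals  # fill the lower triangle, column by column
  m + t(m)                 # mirror into the upper triangle
}

set.seed(1)
dm <- as.matrix(dist(cbind(runif(5), runif(5))))
vals <- shorten_sketch(dm)
length(vals)                                                  # 10
all.equal(expand_sketch(vals), dm, check.attributes = FALSE)  # TRUE
```

`check.attributes = FALSE` is needed because the matrix from `dist()` carries dimnames that the rebuilt matrix does not.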

For a 5 x 5 distance matrix, the stored values are those in the numbered cells below, in the order numbered.

     c1  c2  c3  c4  c5
r1   NA  NA  NA  NA  NA
r2    1  NA  NA  NA  NA
r3    2   5  NA  NA  NA
r4    3   6   8  NA  NA
r5    4   7   9  10  NA

Because a distance matrix is symmetric, this is equivalent to the upper triangle in row-major order, which is probably how it will be treated in Python.

     c1  c2  c3  c4  c5
r1   NA   1   2   3   4
r2   NA  NA   5   6   7
r3   NA  NA  NA   8   9
r4   NA  NA  NA  NA  10
r5   NA  NA  NA  NA  NA
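
The equivalence between the two orderings can be checked with a short base R sketch. It builds a symmetric 5 x 5 matrix numbered like the tables above and is an illustration only, not code from the package.

```r
# Symmetric 5 x 5 matrix whose lower triangle is numbered 1..10
# in column-major order, with a zero diagonal.
m <- matrix(0, 5, 5)
m[lower.tri(m)] <- 1:10
m <- m + t(m)

# Lower triangle read column by column.
lower_col_major <- m[lower.tri(m)]

# Transposing turns rows into columns, so the lower triangle of t(m) read
# column by column is the upper triangle of m read row by row.
upper_row_major <- t(m)[lower.tri(m)]

identical(lower_col_major, upper_row_major)  # TRUE
```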

Examples

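if (FALSE) { # \dontrun{
# Simulate five random locations and compute their pairwise distances
x <- runif(5, 1, 100)
y <- runif(5, 1, 100)
dm <- as.matrix(dist(cbind(x, y)))

# Keep only the 10 non-duplicated distances, then rebuild the full matrix
a <- shorten_distance_matrix(dm)
dm2 <- expand_distance_matrix(a)

# The round trip should reproduce the original distances
all.equal(dm, dm2, check.attributes = FALSE)
} # }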