How to aggregate in R with a custom function that uses two columns


Is it possible to aggregate with a custom function that uses 2 columns and returns 1 column?

Say I have a data frame:

    x <- c(2,4,3,1,5,7)
    y <- c(3,2,6,3,4,6)
    group <- c("a","a","a","a","b","b")

    data <- data.frame(group, x, y)
    data
    #   group x y
    # 1     a 2 3
    # 2     a 4 2
    # 3     a 3 6
    # 4     a 1 3
    # 5     b 5 4
    # 6     b 7 6

And I have a function that I want to use on both columns (x and y):

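    pathlength <- function(xy) {
      # pairwise Euclidean distances between the rows (points) of xy
      out <- as.matrix(dist(xy))
      # sum the entries just below the diagonal: distances between consecutive points
      sum(out[row(out) - col(out) == 1])
    }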

I tried the following with aggregate:

    out <- aggregate(cbind(x, y) ~ group, data, FUN = pathlength)
    out <- aggregate(cbind(x, y) ~ group, data, function(x) pathlength(x))

However, this calls pathlength on x and y separately instead of together, giving me:

    #   group x y
    # 1     a 5 8
    # 2     b 2 2

What I want is to call pathlength on x and y together and aggregate that way. Here is what I want aggregate to do:

    reala <- matrix(c(2,4,3,1,3,2,6,3), nrow=4, ncol=2)
    pathlength(reala)
    # [1] 9.964725

    realb <- matrix(c(5,7,4,6), nrow=2, ncol=2)
    pathlength(realb)
    # [1] 2.828427

    group <- c("a", "b")
    pathlength <- c(9.964725,2.828427)
    real_out <- data.frame(group, pathlength)
    real_out
    #   group pathlength
    # 1     a   9.964725
    # 2     b   2.828427

Does anyone have suggestions? Or is there some other function I can't find on Google that would let me do this? I'd rather not work around this using a loop, which I'm assuming would be slow on a big dataset.

As you've found out, the base aggregate() function works on one column at a time. Instead, you can use the by() function, which passes each group's rows to the function as a whole data frame:

    by(data[,c("x","y")], data$group, pathlength)
    # data$group: a
    # [1] 9.964725
    # -----------------------------------------------------------------------
    # data$group: b
    # [1] 2.828427
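To see the difference for yourself, you can pass a function that only reports how many columns it receives. This is an illustrative sketch, not part of the original answer: NCOL() is a base R function that returns 1 for a plain vector and the column count for a data frame, and the names per_column and whole are made up here.

```r
# Data from the question
data <- data.frame(
  group = c("a", "a", "a", "a", "b", "b"),
  x = c(2, 4, 3, 1, 5, 7),
  y = c(3, 2, 6, 3, 4, 6)
)

# aggregate() hands FUN one column (a plain vector) at a time,
# so NCOL() reports 1 for both x and y in every group
per_column <- aggregate(cbind(x, y) ~ group, data, FUN = NCOL)
per_column

# by() hands FUN the whole x/y sub-data-frame for each group,
# so NCOL() reports 2 for every group
whole <- by(data[, c("x", "y")], data$group, NCOL)
whole
```

Because pathlength needs both coordinates of each point at once, it only gives the intended result in the second situation.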

Or use split()/lapply():

    lapply(split(data[,c("x","y")], data$group), pathlength)
    # $a
    # [1] 9.964725
    #
    # $b
    # [1] 2.828427
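If you want the result as a data frame shaped like real_out in the question, one option is to collect the per-group values with vapply(). This is only a sketch in base R under the question's own data and function; the intermediate names pieces and lens are hypothetical, chosen here for illustration.

```r
# Data and function from the question
data <- data.frame(
  group = c("a", "a", "a", "a", "b", "b"),
  x = c(2, 4, 3, 1, 5, 7),
  y = c(3, 2, 6, 3, 4, 6)
)

pathlength <- function(xy) {
  out <- as.matrix(dist(xy))
  sum(out[row(out) - col(out) == 1])
}

# One x/y sub-data-frame per group
pieces <- split(data[, c("x", "y")], data$group)

# vapply() checks that each group yields exactly one numeric value
lens <- vapply(pieces, pathlength, numeric(1))

# Assemble a two-column data frame: group label and path length
real_out <- data.frame(group = names(lens), pathlength = unname(lens))
real_out
```

vapply() is used instead of sapply() so that a function returning the wrong shape fails loudly rather than silently producing a list.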
