Tags

r data.table

Answers (3)

April 10, 2026 Score: 2 Rep: 1,184 Quality: Low Completeness: 60%

By total coincidence, I was working today on a helper function, calls_across, that does exactly what you want. It pairs with the env argument (the "programming on data.table" interface) in recent versions of {data.table}.

[Edit: as promised, it now allows aliasing the functions on the fly via a named character vector (which preserves GForce).]

library(data.table)
set.seed(1) # please use a seed for reproducibility!!!
dt [...]
#> 1: a 11 1 120 40
#> 2: b 12 2 110 10
#> 3: c 10 5 90 30

Output with verbose=TRUE:

dt[, j, by=class, env=list(j=query), verbose=TRUE]
#> Argument 'by' after substitute: class
#> Argument 'j' after substitute: list(val1_maximo = max(val1, na.rm = TRUE), val1_minimo = min(val1, na.rm = TRUE), val2_maximo = max(val2, na.rm = TRUE), val2_minimo = min(val2, na.rm = TRUE))
#> [...]
#> GForce optimized j to 'list(gmax(val1, na.rm = TRUE), gmin(val1, na.rm = TRUE), gmax(val2, na.rm = TRUE), gmin(val2, na.rm = TRUE))' (see ?GForce)
#> Making each group and running j (GForce TRUE) ...
#> [...]

Note: {data.table}'s "GForce" optimisations of common stats functions avoid (slow) split-apply-combine when there is a by, so it's important to take advantage of them.
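To check whether a given j can benefit from GForce, one can compare a direct call with the same call hidden inside a wrapper. This is a minimal sketch: the table toy and the wrapper my_max are made up for illustration and are not the data or code from the question.

```r
library(data.table)

# Hypothetical toy data, not the data from the question
toy <- data.table(
  class = c("a", "a", "b", "b", "c", "c"),
  val1  = c(1, 11, 2, 12, 5, 10)
)

# Direct call to max(): eligible for GForce (gmax) when used with by
direct <- toy[, .(val1_maximo = max(val1, na.rm = TRUE)), by = class]

# Same computation hidden in a user-defined wrapper: data.table cannot
# see through it, so j is evaluated group by group
my_max <- function(x) max(x, na.rm = TRUE)
wrapped <- toy[, .(val1_maximo = my_max(val1)), by = class]

# Both give the same values; add verbose = TRUE to either call to see
# whether the "GForce optimized j to ..." message appears
print(direct)
print(wrapped)
```

On three groups the difference is invisible; it tends to matter when there are many groups.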

I have since changed the code to allow such aliasing in the funs argument (using a named character vector).

Here is the source code, still a bit WIP [but now revised to allow optional aliases]:

calls_across [...]
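For readers who just want the idea, here is a minimal sketch of how a helper in this spirit could build the j expression. It is not the author's calls_across: the name build_across_call, its arguments, and the toy data are assumptions for illustration. It relies on the env argument of `[.data.table`, which needs a recent {data.table} (1.15.0 or later, as far as I know).

```r
library(data.table)

# Hypothetical helper (not the author's calls_across): builds an unevaluated
# list(<col>_<alias> = <fun>(<col>, ...), ...) call.
# `funs` is a character vector of function names; if it is named, the names
# are used as aliases in the output column names (assumes all-or-none naming).
build_across_call <- function(cols, funs, ...) {
  aliases <- if (is.null(names(funs))) funs else names(funs)
  extra <- list(...)
  calls <- list()
  for (col in cols) {
    for (i in seq_along(funs)) {
      # e.g. max(val1, na.rm = TRUE), kept as a plain call so GForce can see it
      cl <- as.call(c(list(as.name(funs[[i]]), as.name(col)), extra))
      calls[[paste(col, aliases[[i]], sep = "_")]] <- cl
    }
  }
  as.call(c(list(as.name("list")), calls))
}

# Hypothetical toy data
toy <- data.table(
  class = c("a", "a", "b", "b"),
  val1  = c(1, 11, 2, 12),
  val2  = c(40, 120, 10, 110)
)

query <- build_across_call(
  cols = c("val1", "val2"),
  funs = c(maximo = "max", minimo = "min"),
  na.rm = TRUE
)
print(query)

# The call is substituted for j before evaluation
res <- toy[, j, by = class, env = list(j = query)]
print(res)
```

Because the substituted j consists of plain max() and min() calls, adding verbose = TRUE to the last call should show whether GForce kicked in.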
April 10, 2026 Score: 1 Rep: 168,691 Quality: Medium Completeness: 60%

An alternative where the function takes "all columns" (instead of individual columns).

stats2 [...] setNames(paste0(nm, c("_maximo", "_minimo"))) }, frm, names(frm)) |> unname() |> do.call(cbind.data.frame, args = _) }
dt[, stats2(.SD), by = class, .SDcols = c('val1', 'val2')]

   class val1_maximo val1_minimo val2_maximo val2_minimo
1:     a           9           1         120          30
2:     b          11           2          90          10
3:     c          12           4         110          50
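A self-contained sketch of this "all columns" approach, for anyone who wants something to run. The body of stats_all is an assumption written in the spirit of the answer, not its exact stats2, and toy is made-up data.

```r
library(data.table)

# Hypothetical toy data
toy <- data.table(
  class = c("a", "a", "b", "b"),
  val1  = c(1, 9, 2, 11),
  val2  = c(30, 120, 10, 90)
)

# Hypothetical helper: takes all selected columns at once (.SD) and returns
# a one-row data.frame with <col>_maximo and <col>_minimo for each column
stats_all <- function(frm) {
  parts <- Map(function(x, nm) {
    setNames(list(max(x), min(x)), paste0(nm, c("_maximo", "_minimo")))
  }, frm, names(frm))
  # drop the outer names so cbind does not prefix them to the column names
  do.call(cbind.data.frame, unname(parts))
}

res <- toy[, stats_all(.SD), by = class, .SDcols = c("val1", "val2")]
print(res)
```

Note that a wrapper like this is opaque to data.table, so j is evaluated once per group.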


Reproducible data:

set.seed(42)
dt [...]
April 10, 2026 Score: 1 Rep: 2,843 Quality: Low Completeness: 60%

You need to modify stats to accept an array (one column per numeric variable) and return a simple list (not a list of lists). Keeping everything as lists (instead of arrays or matrices) within stats will preserve the names needed to identify the columns.

stats
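A minimal sketch of a stats function along these lines, under my own assumptions: stats_flat and toy are illustrative names, and max/min stand in for whichever statistics are wanted.

```r
library(data.table)

# Hypothetical toy data
toy <- data.table(
  class = c("a", "a", "b", "b"),
  val1  = c(1, 9, 2, 11),
  val2  = c(30, 120, 10, 90)
)

# Hypothetical stats(): receives .SD (one column per numeric variable) and
# returns one flat named list; data.table makes one column per list element
stats_flat <- function(sd) {
  per_col <- lapply(names(sd), function(nm) {
    x <- sd[[nm]]
    setNames(list(max(x), min(x)), paste0(nm, c("_maximo", "_minimo")))
  })
  # per_col is a list of lists; flatten one level so the names survive
  do.call(c, per_col)
}

res <- toy[, stats_flat(.SD), by = class, .SDcols = c("val1", "val2")]
print(res)
```

Returning a list of lists instead would make data.table create list columns, which is why the flattening step matters.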