Question Details

No question body available.

Tags

r data-manipulation

Answers (3)

February 6, 2026 Score: 3 Rep: 105,922 Quality: Medium Completeness: 50%

With base R, you can try cummax within ave

> transform(df, new = ave(value >= 99, id, FUN = cummax))
   id       date     time cost value new
1   1 2026-01-24 10:52:44  234    34   0
2   1 2026-01-24 10:54:44  236    68   0
3   1 2026-01-24 11:05:41  353    99   1
4   1 2026-01-25 11:52:44 5352    78   1
5   1 2026-01-25 11:58:14  143    99   1
6   2 2026-01-24 10:02:44  124    99   1
7   2 2026-01-24 10:22:44  636    99   1
8   3 2026-01-24 10:52:12   53    75   0
9   3 2026-01-25 10:52:44  436    99   1
10  3 2026-01-25 11:12:23 9473    13   1
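To see how the grouping and the running maximum interact, here is a minimal sketch on a made-up two-group data frame (the name `toy` and its values are illustrative, not from the question). `ave` applies `cummax` to the logical vector separately within each `id`, so the flag stays at 1 once the threshold has been reached and starts over for the next group.

```r
# Hypothetical toy data: two groups, each reaching 99 at a different row.
toy <- data.frame(
  id    = c("a", "a", "a", "b", "b"),
  value = c(10, 99, 20, 50, 99)
)

# value >= 99 is logical; cummax() treats TRUE/FALSE as 1/0 and carries
# the running maximum forward, restarting for every id because of ave().
toy$new <- ave(toy$value >= 99, toy$id, FUN = cummax)

print(toy)
```

Note that this answer tests `value >= 99`, whereas the other answers test `value == 99`; the two agree on the data shown but would differ if values above 99 occurred.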
February 6, 2026 Score: 2 Rep: 18,811 Quality: Low Completeness: 50%

In dplyr, you can use cumany, which returns TRUE from the first TRUE onward. The unary + converts the logical result to integer 1/0.

df %>% mutate(new = +(cumany(value == 99)), .by = id)

Output:

#  id       date     time cost value new
1   1 2026-01-24 10:52:44  234    34   0
2   1 2026-01-24 10:54:44  236    68   0
3   1 2026-01-24 11:05:41  353    99   1
4   1 2026-01-25 11:52:44 5352    78   1
5   1 2026-01-25 11:58:14  143    99   1
6   2 2026-01-24 10:02:44  124    99   1
7   2 2026-01-24 10:22:44  636    99   1
8   3 2026-01-24 10:52:12   53    75   0
9   3 2026-01-25 10:52:44  436    99   1
10  3 2026-01-25 11:12:23 9473    13   1
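As a rough illustration of what `cumany` and the unary `+` do, the sketch below runs them on a single made-up logical vector (`hit` and `toy` are hypothetical names). It also shows a `group_by()` variant, which may be useful on dplyr versions older than 1.1.0, where the `.by` argument is not available.

```r
library(dplyr)

# Hypothetical condition vector: the first TRUE is at position 3.
hit <- c(FALSE, FALSE, TRUE, FALSE, TRUE)

cumany(hit)     # logical: TRUE from the first TRUE onward
+cumany(hit)    # unary plus coerces the logical vector to integer

# Same idea per group, written with group_by() instead of .by (made-up data).
toy <- data.frame(id = c(1, 1, 1, 2, 2), value = c(34, 99, 78, 75, 99))
res <- toy %>%
  group_by(id) %>%
  mutate(new = +cumany(value == 99)) %>%
  ungroup()

print(res)
```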

February 6, 2026 Score: 1 Rep: 11,996 Quality: Medium Completeness: 60%

You could take the cumulative sum of the condition value == 99 per group, check whether it is >= 1, and convert the resulting logical to integer using the unary +.

dplyr::mutate(X, new = +(cumsum(value == 99) >= 1), .by = id)

or

transform(X, new = ave(value, id, FUN = \(x) +(cumsum(x == 99) >= 1)))

or

library(data.table)
setDT(X)[, new := +(cumsum(value == 99) >= 1), by = id][]

or as @skbaldur pointed out

setDT(X)[, new := cummax(value == 99), by = id][]

or collapse

collapse::ftransform(X, new = +(collapse::fcumsum(value == 99, g = id) >= 1))

giving

#  id       date     time cost value new
1   1 2026-01-24 10:52:44  234    34   0
2   1 2026-01-24 10:54:44  236    68   0
3   1 2026-01-24 11:05:41  353    99   1
4   1 2026-01-25 11:52:44 5352    78   1
5   1 2026-01-25 11:58:14  143    99   1
6   2 2026-01-24 10:02:44  124    99   1
7   2 2026-01-24 10:22:44  636    99   1
8   3 2026-01-24 10:52:12   53    75   0
9   3 2026-01-25 10:52:44  436    99   1
10  3 2026-01-25 11:12:23 9473    13   1
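The intermediate steps behind the `cumsum` variants can be traced on one group's values. This is only a sketch using the `value` column of id 1 from the output above; `v`, `hit`, `flag_sum` and `flag_max` are names introduced here for illustration.

```r
v <- c(34, 68, 99, 78, 99)       # value column for id 1

hit <- v == 99                   # FALSE FALSE TRUE FALSE TRUE
cumsum(hit)                      # running count of matches: 0 0 1 1 2
cumsum(hit) >= 1                 # TRUE once at least one match has been seen

flag_sum <- +(cumsum(hit) >= 1)  # unary plus turns the logical into 0/1
flag_max <- cummax(hit)          # running maximum of 0/1 gives the same flag

print(flag_sum)
print(flag_max)
```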

data
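The definition of `X` is not shown in full here. As a stand-in, the following sketch rebuilds a data frame from the printed output above; the column types (integer ids, character dates and times) are assumptions, since the output alone does not reveal them.

```r
# Reconstructed from the printed output; column types are assumptions.
X <- data.frame(
  id    = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  date  = c("2026-01-24", "2026-01-24", "2026-01-24", "2026-01-25", "2026-01-25",
            "2026-01-24", "2026-01-24", "2026-01-24", "2026-01-25", "2026-01-25"),
  time  = c("10:52:44", "10:54:44", "11:05:41", "11:52:44", "11:58:14",
            "10:02:44", "10:22:44", "10:52:12", "10:52:44", "11:12:23"),
  cost  = c(234L, 236L, 353L, 5352L, 143L, 124L, 636L, 53L, 436L, 9473L),
  value = c(34L, 68L, 99L, 78L, 99L, 99L, 99L, 75L, 99L, 13L)
)

# Quick check: one of the base R variants should reproduce the new column.
out <- transform(X, new = ave(value, id, FUN = \(x) +(cumsum(x == 99) >= 1)))
print(out)
```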

X <- ...

bench::mark(
  dplyr_cumsum = dplyr::mutate(X, new = +(cumsum(value == 99) >= 1), .by = id),
  dplyrcumany = dplyr::mutate(X, new = +(dplyr::cumany(value == 99)), .by = id),
  avecummax = transform(X,new = ave(value >= 99, id, FUN = cummax)),
  avecumsum = transform(X,new = ave(value, id, FUN = \(x) +(cumsum(x == 99) >= 1))),
  collapse = collapse::ftransform(X, new = +(collapse::fcumsum(value == 99, g = id) >= 1)),
  m = transform(X, new = ave(value == 99, id, FUN = \(x) as.integer(Reduce(`|`, x, accumulate = TRUE)))),
  iterations = 5000
) |> dplyr::arrange(median)

# A tibble: 6 × 13
  expression       min  median itr/sec mem_alloc gc/sec n_itr n_gc total_time
1 collapse        23µs  24.8µs  37016.    1.75KB   14.8  4998    2   135.02ms
2 avecummax    151.9µs   161µs   6105.   16.19KB   12.2  4990   10   817.42ms
3 avecumsum    156.3µs 167.7µs   5504.   16.19KB   12.1  4989   11   906.36ms
4 m            181.7µs 192.5µs   5025.   16.19KB   13.1  4987   13   992.44ms
5 dplyrcumany   1.34ms  1.38ms    716.   20.38KB   10.9  4925   75      6.87s
6 dplyr_cumsum  1.34ms  1.38ms    715.   20.38KB   11.0  4924   76      6.89s
# Remaining columns (no values printed): result, memory, time, gc