Tags

r tidyr

Answers (1)

March 4, 2026 Score: 5 Rep: 8,471 Quality: Medium Completeness: 80%

The regex pattern you need has two capture groups, each written in parentheses (): one for the pathogen part of the column name, and one for the statistic (median or mad):

library(tidyr)

dataexample |>
  pivot_longer(
    cols = -drugname,
    names_to = c("pathogen", ".value"),
    names_pattern = "^(pathogen\\d+)_(median|mad)_molar"
  )

# A tibble: 4 × 4
  drugname pathogen  median   mad
1 drug1    pathogen1      1  0.5
2 drug1    pathogen2      2  0.25
3 drug2    pathogen1      4  2
4 drug2    pathogen2      8  1

The regex works like this:

  • ^ anchors the match at the start of each column name
  • (pathogen\\d+) captures the literal string "pathogen" together with the one or more digits (\\d+) that follow it
  • _(median|mad)_molar captures either the literal "median" or the literal "mad", but only when it sits between the literals "_" and "_molar"
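To see what each capture group picks up, here is a minimal base R sketch. The column names are hypothetical: they are assumed to look like pathogen1_median_molar, which is what the pattern implies but which the question does not show.

```r
# Hypothetical column names, assumed to follow the naming scheme the regex implies
cols <- c(
  "pathogen1_median_molar", "pathogen1_mad_molar",
  "pathogen2_median_molar", "pathogen2_mad_molar"
)

pattern <- "^(pathogen\\d+)_(median|mad)_molar"

# regexec() returns the full match plus one entry per capture group
m <- regmatches(cols, regexec(pattern, cols))

# Stack the per-column results into a character matrix, one row per column name
parts <- do.call(rbind, m)
colnames(parts) <- c("full_match", "pathogen", ".value")
parts
```

The second and third columns of parts are the pieces that pivot_longer() would hand to "pathogen" and ".value" respectively.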

The elements of names_to correspond, in order, to the capture groups. ".value" is a special name: instead of storing the matched strings in a column, pivot_longer() uses each unique string matched by that capture group as the name of an output column. In your case, the values from columns whose names contain "median" go into a median column, and those from columns containing "mad" go into a mad column.
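For a self-contained check, the sketch below rebuilds an input data frame from the output shown above. The wide column names (pathogen1_median_molar and so on) are an assumption inferred from the regex, not taken from the original question.

```r
library(tidyr)

# Hypothetical input, reconstructed from the pivoted output; column names are assumed
dataexample <- data.frame(
  drugname = c("drug1", "drug2"),
  pathogen1_median_molar = c(1, 4),
  pathogen1_mad_molar = c(0.5, 2),
  pathogen2_median_molar = c(2, 8),
  pathogen2_mad_molar = c(0.25, 1)
)

long <- dataexample |>
  pivot_longer(
    cols = -drugname,
    # first capture group -> "pathogen" column; second -> output column names
    names_to = c("pathogen", ".value"),
    names_pattern = "^(pathogen\\d+)_(median|mad)_molar"
  )

long
```

If the real column names differ (for example, a different separator or suffix), only the names_pattern string should need adjusting.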