Find duplicate observations in a data frame by an identifier.
Source: R/find_duplicates.R
Arguments
- x
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).
- ...
<data-masking> Variables to group by.
- sort
If TRUE, will show the largest groups at the top.
- name
The name of the new column in the output.
If omitted, it will default to n. If there's already a column called n, it will use nn. If there are columns called n and nn, it'll use nnn, and so on, adding another n each time until it gets a new name.
Examples
x <- data.frame(subject = c("a", "a", "b", "c", "c"))
find_duplicates(x, subject)
#>   subject n
#> 1       a 2
#> 2       c 2
find_duplicates(mtcars, vs, am)
#>   vs am  n
#> 1  0  0 12
#> 2  0  1  6
#> 3  1  0  7
#> 4  1  1  7
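The sort and name arguments are not exercised above. The following is a minimal sketch of how they might behave, assuming the function acts like dplyr::count() followed by a filter that keeps groups with more than one row. The helper find_duplicates_sketch() is a hypothetical stand-in written for illustration, not the package's actual source, and for simplicity it defaults name to "n" rather than reproducing the n / nn / nnn fallback.

```r
library(dplyr)

# Hypothetical stand-in (an assumption, not the real implementation):
# count rows per group, then keep only groups that occur more than once.
find_duplicates_sketch <- function(x, ..., sort = FALSE, name = "n") {
  x |>
    count(..., sort = sort, name = name) |>
    filter(.data[[name]] > 1)
}

x <- data.frame(subject = c("a", "a", "b", "c", "c", "c"))

# sort = TRUE puts the largest group first; name renames the count column.
res <- find_duplicates_sketch(x, subject, sort = TRUE, name = "copies")
res
#>   subject copies
#> 1       c      3
#> 2       a      2
```

In this sketch the subject "b" is dropped because it appears only once, which also explains why every vs/am combination is returned for mtcars: each of the four groups contains more than one row.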