
Find duplicate observations in a data frame by an identifier.

Usage

find_duplicates(x, ..., sort = FALSE, name = "n")

Arguments

x

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

...

<data-masking> Variables to group by. Together they form the identifier that is checked for duplicates.

sort

If TRUE, will show the largest groups at the top.

name

The name of the new column in the output.

If omitted, it will default to n. If there's already a column called n, it will use nn. If there are columns called n and nn, it will use nnn, and so on, appending another n until the name is unused.

Examples

# Subjects that appear in more than one row
x <- data.frame(subject = c("a", "a", "b", "c", "c"))
find_duplicates(x, subject)
#>   subject n
#> 1       a 2
#> 2       c 2
# Several variables can be combined into one identifier
find_duplicates(mtcars, vs, am)
#>   vs am  n
#> 1  0  0 12
#> 2  0  1  6
#> 3  1  0  7
#> 4  1  1  7
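
For readers who want to see what the call amounts to, the first example can be approximated with plain dplyr verbs. This is a minimal sketch, not the package's implementation: it assumes find_duplicates() behaves like dplyr::count() followed by a filter that keeps groups with more than one row, and the object name dupes is only illustrative.

```r
library(dplyr)

x <- data.frame(subject = c("a", "a", "b", "c", "c"))

# Assumption: duplicates are the groups whose row count exceeds 1.
# count() tallies the rows per subject; filter() drops the unique ones.
dupes <- x %>%
  count(subject, sort = FALSE, name = "n") %>%
  filter(n > 1)

dupes
```

Under the same assumption, passing sort = TRUE to count() would order the result by descending group size, and changing name would rename the count column; the filter would then have to refer to the new column name.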