R: Keep only the first occurring row when a certain column contains duplicate values


snippets · 2 years ago · PeakD

I want to keep only the first occurring row when a certain column contains duplicate values.

Suppose we have a dataframe like this:

df <- data.frame(
  A = c(99, 99, 99, 100, 101),
  B = c("apple", "orange", "apple", "banana", "apple"),
  C = c("eaten", "eaten", "fresh", "eaten", "fresh")
) # create sample dataframe

I want to drop the second and third rows because they repeat the value 99 in column A. Only the first record with A equal to 99 should be kept, while the rows for 100 and 101 are retained.

This is what the code looks like using distinct() from the dplyr package, which keeps the first row for each value of A and removes the duplicates that follow it.

library(dplyr) # provides distinct() and the %>% pipe

df_unique <- df %>%
  distinct(A, .keep_all = TRUE) # Keep the first row per value of A, with all other columns
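For comparison, the same filtering can be sketched in base R without dplyr. This is only an illustrative alternative, not part of the original snippet; the names is_repeat and df_unique_base are made up for this example. It relies on duplicated(), which marks every repeat of a value after its first occurrence.

```r
# Sample data, copied from the post
df <- data.frame(
  A = c(99, 99, 99, 100, 101),
  B = c("apple", "orange", "apple", "banana", "apple"),
  C = c("eaten", "eaten", "fresh", "eaten", "fresh")
)

# duplicated() returns TRUE for each repeat of a value after its first occurrence
# (is_repeat is a hypothetical helper name used only in this sketch)
is_repeat <- duplicated(df$A)

# Keep the rows that are not repeats; all columns are retained
df_unique_base <- df[!is_repeat, ]

print(df_unique_base)
```

With the sample data, this leaves the first, fourth, and fifth rows (A equal to 99, 100, and 101). Base R subsetting keeps the original row names, so they read 1, 4, 5; running rownames(df_unique_base) <- NULL resets them if that matters.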
