Waivio

R: Keep only the first occurring row when a column contains duplicate values

snippets | 2 years ago | PeakD

I want to keep only the first occurring row when several rows share the same value in a certain column.

Suppose we have a dataframe like this:

# Create the sample dataframe
df <- data.frame(
  A = c(99, 99, 99, 100, 101),
  B = c("apple", "orange", "apple", "banana", "apple"),
  C = c("eaten", "eaten", "fresh", "eaten", "fresh")
)

I want to drop the second and third rows: A is 99 in the first three rows, and I only want to keep the first of them while retaining the rest of the rows.

This is how the code looks with distinct() from dplyr, which keeps the first row for each value of A and removes the later duplicates. The .keep_all = TRUE argument retains the other columns (B and C) in the result.

library(dplyr) # provides distinct() and the %>% pipe

df_unique <- df %>%
  distinct(A, .keep_all = TRUE) # keep only the first row for each value of A
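
For comparison, here is a minimal base R sketch of the same idea, in case dplyr is not available. It uses duplicated(), which flags every repeat of a value after its first occurrence. The name df_unique_base is just an illustrative choice and not part of the original snippet.

```r
# Same sample data as above, repeated so this block runs on its own
df <- data.frame(
  A = c(99, 99, 99, 100, 101),
  B = c("apple", "orange", "apple", "banana", "apple"),
  C = c("eaten", "eaten", "fresh", "eaten", "fresh")
)

# duplicated(df$A) is TRUE for rows 2 and 3 (repeats of 99),
# so negating it keeps only the first row for each value of A
df_unique_base <- df[!duplicated(df$A), ]

print(df_unique_base)
```

This should leave three rows: A = 99 (apple, eaten), A = 100 (banana, eaten) and A = 101 (apple, fresh). Note that the base R version keeps the original row names (1, 4, 5).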
