Exercises
Let's look at two other cases of using sapply vs lapply, one involving quantile and one involving unique.
qsap1 <- sapply(nyc_taxi[ , trip_metrics], quantile, probs = c(.01, .05, .95, .99), na.rm = TRUE)
qlap1 <- lapply(nyc_taxi[ , trip_metrics], quantile, probs = c(.01, .05, .95, .99), na.rm = TRUE)
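Before querying the two results, it can help to compare their shapes. Here is a minimal sketch on a made-up data frame: toy and its values are hypothetical stand-ins for nyc_taxi[ , trip_metrics], not part of the taxi data.

```r
# Hypothetical toy data standing in for nyc_taxi[ , trip_metrics]
toy <- data.frame(a = c(1, 2, 3, 4, NA), b = c(10, 20, 30, 40, 50))
probs <- c(.05, .95)

qs <- sapply(toy, quantile, probs = probs, na.rm = TRUE)
ql <- lapply(toy, quantile, probs = probs, na.rm = TRUE)

is.matrix(qs)  # sapply simplifies equal-length results into a matrix
dim(qs)        # one row per requested quantile, one column per data column
is.list(ql)    # lapply always returns a list
names(ql)      # one list element per data column
```

The same inspection functions (is.matrix, dim, is.list, names) can be applied to qsap1 and qlap1 to decide how to index each one.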
(1) Query qsap1 and qlap1 for the 5th and 95th percentiles of trip_distance and trip_duration.
Let's now try the same thing, but this time pass the unique function to both sapply and lapply. unique returns the distinct values found in each column.
qsap2 <- sapply(nyc_taxi[ , trip_metrics], unique)
qlap2 <- lapply(nyc_taxi[ , trip_metrics], unique)
(2) Query qsap2 and qlap2 to show the distinct values of passenger_count and tip_percent. Can you tell why sapply and lapply both returned lists in the second case?
(3) Use qlap2 to find the number of unique values for each column.