Solutions
(1) Because qsap1 is a matrix, we can query it the same way we query any n-dimensional array:
qsap1[c('5%', '95%'), c('trip_distance', 'trip_duration')]
trip_distance trip_duration
5% 0.5 178
95% 10.2 2038
Since qlap1 is a list with one element for each column of the data, we use double brackets to extract the percentiles for each column separately. Moreover, because the percentiles themselves are stored in a named vector, we can pass the names of the percentiles we want in a single bracket to get the desired result.
qlap1[['trip_distance']][c('5%', '95%')]
5% 95%
0.5 10.2
qlap1[['trip_duration']][c('5%', '95%')]
5% 95%
178 2038
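The difference between the two indexing styles can be reproduced on a small stand-in dataset. The sketch below is only illustrative: the data frame toy and its values are made up, and toy_sap and toy_lap are hypothetical names playing the roles of qsap1 and qlap1.

```r
# Toy data standing in for the course dataset (values are made up).
toy <- data.frame(
  trip_distance = c(0.8, 1.5, 2.3, 4.1, 9.7),
  trip_duration = c(210, 420, 600, 900, 1800)
)

probs <- c(0.05, 0.95)
toy_sap <- sapply(toy, quantile, probs = probs)  # simplified to a matrix
toy_lap <- lapply(toy, quantile, probs = probs)  # always a list

# Matrix: rows and columns are selected by name inside one single bracket.
from_matrix <- toy_sap[c('5%', '95%'), 'trip_distance']

# List: double brackets pull out one element (a named vector),
# then a single bracket subsets that vector by name.
from_list <- toy_lap[['trip_distance']][c('5%', '95%')]

print(from_matrix)
print(from_list)
```

Both routes should give the same two percentiles; only the container they are stored in differs.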
(2) In this case, sapply and lapply both return a list: the columns have different numbers of unique values, so sapply cannot simplify the results into a vector or matrix. We can extract the results for passenger_count and tip_percent as a sublist by passing their names in a single bracket.
qsap2[c('passenger_count', 'tip_percent')]
$passenger_count
[1] 5 1 2 6 3 4 0 9 7 8
$tip_percent
[1] 23 0 17 2 12 6 21 18 20 16 13 19 1 7 14 10 22 11 25 8 15 5 9 3 26 24
[27] 4 NA 30 36 35 28 33 54 58 27 34 31 29 32 66 70 47 99 40 37 82 57 45 46 44 50
[53] 55 43 65 38 60 42 76 90 41 53 64 61 51 73 49 83 71 81 62 80 86 94 72 87 56 63
[79] 88 52 93 48 39 84 92 91 79 74 75 78 68 89 96 67 69 97 85 59 95 98 77
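To see why sapply falls back to a list here, it can help to try it on a tiny example. This is a sketch on made-up data: toy and toy_uniq are hypothetical names, with toy_uniq playing the role of qsap2.

```r
# Toy data (made up): the two columns have different numbers of unique values.
toy <- data.frame(
  passenger_count = c(1, 1, 2, 1, 2, 1),
  tip_percent     = c(0, 15, 20, 15, NA, 18)
)

# unique() returns vectors of different lengths, so sapply cannot
# arrange them as a matrix and returns a list instead.
toy_uniq <- sapply(toy, unique)
print(is.list(toy_uniq))

# A single bracket keeps the list structure (a sublist) ...
sub <- toy_uniq['tip_percent']
# ... while a double bracket returns the element itself (a vector).
vec <- toy_uniq[['tip_percent']]

print(sub)
print(vec)
```

Had every column produced the same number of unique values, sapply would have simplified the result to a matrix, which is one reason its return type is worth checking before indexing.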
(3) Since we have the unique values for each column stored in qlap2, we can just run the length function to count how many unique values each column has. For example, for passenger_count we have
length(qlap2[['passenger_count']]) # don't forget the double bracket here!
[1] 10
But we want to do this automatically for all the columns at once. The solution is to use sapply. So far we've been using sapply and lapply with the dataset as input. But we can just as well feed them any arbitrary list, such as qlap2, and apply a function to each element of that list (as long as doing so doesn't result in an error for any of the list's elements).
sapply(qlap2, length)
passenger_count trip_distance fare_amount tip_amount trip_duration
10 3632 1162 2957 8965
tip_percent
101
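The same pattern works on any hand-built list, not only on one derived from a dataset. The sketch below uses a made-up list, toy_list, and also shows vapply, a base R relative of sapply that lets us state the expected return type up front; using it here is a suggestion, not part of the original solution.

```r
# A hand-built list (values are made up); elements have different lengths.
toy_list <- list(
  passenger_count = c(1, 2, 5),
  tip_percent     = c(0, 15, 20, NA)
)

# sapply applies length() to each element and simplifies to a named vector.
n_values <- sapply(toy_list, length)

# vapply does the same but fails loudly if any result
# is not a single integer.
n_values_checked <- vapply(toy_list, length, integer(1))

print(n_values)
print(n_values_checked)
```

Note that length counts NA as a value, so a column whose unique values include NA has that NA included in its count.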
The above exercise offers a glimpse of how powerful R can be at quickly and succinctly processing the basic data types, as long as we write good functions and use the apply family of functions to iterate through them. A good goal to set for yourself as an R programmer is to increase your reliance on the apply family of functions to run your code.