Finding neighborhoods
We now write a transformation function, find_nhoods, that finds the neighborhood and borough of each pick-up and drop-off location from its coordinates.
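To make the shape of such a transformation function concrete, here is a minimal sketch in base R. It is not the course's find_nhoods: the real function relies on the sp, maptools and rgeos packages and on the shapefile object passed in through transformObjects, none of which is reproduced here. The names toy_nhoods, lookup_box and find_nhoods_sketch are hypothetical, and rectangular bounding boxes stand in for the shapefile polygons. What the sketch does show is the contract a chunk-wise transform is assumed to follow: it receives one block of data as a list of columns and returns that list with the new columns added.

```r
# Hypothetical stand-in for the shapefile: each "neighborhood" is a plain
# bounding box, not a real polygon.
toy_nhoods <- data.frame(
  name    = c("Box A", "Box B"),
  borough = c("Borough 1", "Borough 2"),
  xmin = c(-74.00, -73.98), xmax = c(-73.98, -73.96),
  ymin = c(40.70, 40.70),   ymax = c(40.75, 40.75),
  stringsAsFactors = FALSE
)

# For each point, return the row index of the first box that contains it,
# or NA when the point is missing or falls outside every box.
lookup_box <- function(lon, lat, boxes) {
  idx <- rep(NA_integer_, length(lon))
  for (i in seq_len(nrow(boxes))) {
    inside <- !is.na(lon) & !is.na(lat) &
      lon >= boxes$xmin[i] & lon < boxes$xmax[i] &
      lat >= boxes$ymin[i] & lat < boxes$ymax[i]
    idx[inside & is.na(idx)] <- i
  }
  idx
}

# Chunk-wise transform: takes one block of data as a list of columns and
# returns it with four new columns appended.
find_nhoods_sketch <- function(data, boxes = toy_nhoods) {
  pickup_idx  <- lookup_box(data$pickup_longitude, data$pickup_latitude, boxes)
  dropoff_idx <- lookup_box(data$dropoff_longitude, data$dropoff_latitude, boxes)
  data$pickup_nhood    <- boxes$name[pickup_idx]
  data$pickup_borough  <- boxes$borough[pickup_idx]
  data$dropoff_nhood   <- boxes$name[dropoff_idx]
  data$dropoff_borough <- boxes$borough[dropoff_idx]
  data
}

# A made-up three-row chunk to exercise the function.
chunk <- list(
  pickup_longitude  = c(-73.99, -73.97, NA),
  pickup_latitude   = c(40.73, 40.72, 40.71),
  dropoff_longitude = c(-73.97, -73.90, -73.99),
  dropoff_latitude  = c(40.74, 40.72, 40.71)
)
result <- find_nhoods_sketch(chunk)
```

In this sketch a point that is missing or lies outside every box gets NA for both the neighborhood and the borough, which is one way NA values can end up in the new columns.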
Since we tested the function in the prior exercise and it worked as expected, we can now apply the transformation to the whole dataset with reasonable confidence that it will work.
st <- Sys.time()
rxDataStep(nyc_xdf, nyc_xdf, overwrite = TRUE,
           transformFunc = find_nhoods,
           transformPackages = c("sp", "maptools", "rgeos"),
           transformObjects = list(shapefile = nyc_shapefile))
Sys.time() - st
rxGetInfo(nyc_xdf, numRows = 5)
Time difference of 30.77251 mins
File name: C:\Data\NYC_taxi\yellow_tripdata_2016.xdf
Number of observations: 69406520
Number of variables: 29
Number of blocks: 141
Compression type: zlib
Data (5 rows starting with row 1):
VendorID tpep_pickup_datetime tpep_dropoff_datetime passenger_count trip_distance
1 2 2016-01-01 00:00:00 2016-01-01 00:00:00 2 1.10
2 2 2016-01-01 00:00:00 2016-01-01 00:00:00 5 4.90
3 2 2016-01-01 00:00:00 2016-01-01 00:00:00 1 10.54
4 2 2016-01-01 00:00:00 2016-01-01 00:00:00 1 4.75
5 2 2016-01-01 00:00:00 2016-01-01 00:00:00 3 1.76
pickup_longitude pickup_latitude RatecodeID store_and_fwd_flag dropoff_longitude
1 -73.99037 40.73470 1 N -73.98184
2 -73.98078 40.72991 1 N -73.94447
3 -73.98455 40.67957 1 N -73.95027
4 -73.99347 40.71899 1 N -73.96224
5 -73.96062 40.78133 1 N -73.97726
dropoff_latitude payment_type fare_amount extra mta_tax tip_amount tolls_amount
1 40.73241 2 7.5 0.5 0.5 0 0
2 40.71668 1 18.0 0.5 0.5 0 0
3 40.78893 1 33.0 0.5 0.5 0 0
4 40.65733 2 16.5 0.0 0.5 0 0
5 40.75851 2 8.0 0.0 0.5 0 0
improvement_surcharge total_amount tip_percent pickup_hour pickup_dow
1 0.3 8.8 0 10PM-1AM Fri
2 0.3 19.3 0 10PM-1AM Fri
3 0.3 34.3 0 10PM-1AM Fri
4 0.3 17.3 0 10PM-1AM Fri
5 0.3 8.8 0 10PM-1AM Fri
dropoff_hour dropoff_dow trip_duration pickup_nhood pickup_borough
1 10PM-1AM Fri 0 Greenwich Village New York City-Manhattan
2 10PM-1AM Fri 0 East Village New York City-Manhattan
3 10PM-1AM Fri 0 Boerum Hill New York City-Brooklyn
4 10PM-1AM Fri 0 Lower East Side New York City-Manhattan
5 10PM-1AM Fri 0 Upper East Side New York City-Manhattan
dropoff_nhood dropoff_borough
1 Gramercy New York City-Manhattan
2 <NA> <NA>
3 Yorkville New York City-Manhattan
4 <NA> <NA>
5 Midtown New York City-Manhattan