gtfs2gpshas a set of functions to convert public transport GTFS data to GPS-like format using
data.table. It also has some functions to convert both representations to simple feature format.
gtfs2gps allows users to convert public
transport GTFS data into a single
data.table format with
GPS-like records, which can then be used in various applications such as
running transport simulations or scenario analyses. Before using the
package, just install it from GitHub.
After loading the package, GTFS data can be read into R by using
read_gtfs(). This function gets a zipped GTFS file and
returns a list of
data.table objects. The returning list
contains the data of each GTFS file indexed according to their file
names without extension.
library("gtfs2gps") #> gtfs2gps version 2.1-1 is now loaded #> NOTE: All filter functions from gtfs2gps were removed #> Please replace them by similar functions from gtfstools <- read_gtfs(system.file("extdata/poa.zip", package ="gtfs2gps")) poa #> Unzipped the following files to C:\Users\pedro\AppData\Local\Temp\RtmpKIEurg/gtfsio: #> * agency.txt #> * calendar.txt #> * routes.txt #> * shapes.txt #> * stop_times.txt #> * stops.txt #> * trips.txt #> Reading agency.txt #> Reading calendar.txt #> Reading routes.txt #> Reading shapes.txt #> Reading stop_times.txt #> Reading stops.txt #> Reading trips.txt names(poa) #>  "agency" "calendar" "routes" "shapes" "stop_times" #>  "stops" "trips" head(poa$trips) #> route_id service_id trip_id shape_id #> 1: T2 T2@1 T2-1@1#520 T2-1 #> 2: T2 T2@1 T2-1@1#540 T2-1 #> 3: T2 T2@1 T2-1@1#555 T2-1 #> 4: T2 T2@1 T2-1@1#610 T2-1 #> 5: T2 T2@1 T2-1@1#620 T2-1 #> 6: T2 T2@1 T2-1@1#628 T2-1
Note that not all GTFS files are loaded into R. This function only
loads the necessary data to spatially and temporally handle trips and
stops, which are: “shapes.txt”, “stop_times.txt”, “stops.txt”,
“trips.txt”, “agency.txt”, “calendar.txt”, “routes.txt”, and
“frequencies.txt”, with this last four being optional. If a given GTFS
zipped file does not contain all of these required files then
read_gtfs() will stop with an error.
In the code below we filter only the shape ids `c(“T2-1”, “A141-1”) to allow faster execution the next scripts.
library(magrittr) object.size(poa) %>% format(units = "Kb") #>  "1145.7 Kb" <- gtfstools::filter_by_shape_id(poa, c("T2-1", "A141-1")) poa_small object.size(poa_small) %>% format(units = "Kb") #>  "559.5 Kb"
We can then easily convert the data to simple feature format and plot them.
<- gtfs2gps::gtfs_shapes_as_sf(poa_small) poa_small_shapes_sf <- gtfs2gps::gtfs_stops_as_sf(poa_small) poa_small_stops_sf plot(sf::st_geometry(poa_small_shapes_sf)) plot(sf::st_geometry(poa_small_stops_sf), pch = 20, col = "red", add = TRUE) box()
After subsetting the data, it is also possible to save it as a new
GTFS file using
write_gtfs(), as shown below.
<- tempfile(pattern = 'poa_small', fileext = '.zip') temp_gtfs ::write_gtfs(poa_small, temp_gtfs)gtfs2gps
To convert GTFS to GPS-like format, use
is the core function of the package. It takes a GTFS zipped file as an
input and returns a
data.table where each row represents a
‘GPS-like’ data point for every trip in the GTFS file. In summary, this
function interpolates the space-time position of each vehicle in each
trip considering the network distance and average speed between stops.
The function samples the timestamp of each vehicle every \(15m\) by default, but the user can set a
different value in the
spatial_resolution argument. See the
<- gtfs2gps(temp_gtfs, spatial_resolution = 100) poa_gps head(poa_gps) #> shape_id trip_id route_type id timestamp shape_pt_lon shape_pt_lat #> 1: A141-1 A141-1@1#30 3 1 <NA> -51.14692 -30.14979 #> 2: A141-1 A141-1@1#30 3 2 <NA> -51.14610 -30.15014 #> 3: A141-1 A141-1@1#30 3 3 00:30:00 -51.14570 -30.15031 #> 4: A141-1 A141-1@1#30 3 4 00:30:00 -51.14570 -30.15031 #> 5: A141-1 A141-1@1#30 3 5 00:30:29 -51.14493 -30.15064 #> 6: A141-1 A141-1@1#30 3 6 00:30:58 -51.14417 -30.15097 #> stop_id stop_sequence speed dist cumdist #> 1: <NA> NA 1.000000e-12 [km/h] 0.00000 [m] 0.00000 [m] #> 2: <NA> NA NA [km/h] 87.23608 [m] 87.23608 [m] #> 3: 434 1 NA [km/h] 43.32548 [m] 130.56155 [m] #> 4: 434 1 1.000000e-12 [km/h] 0.00000 [m] 130.56155 [m] #> 5: <NA> NA 1.013763e+01 [km/h] 82.11437 [m] 212.67592 [m] #> 6: <NA> NA 1.013763e+01 [km/h] 82.11437 [m] 294.79029 [m] #> cumtime trip_number #> 1: 0.00000 [s] 1 #> 2: NA [s] 1 #> 3: NA [s] 1 #> 4: 0.00000 [s] 1 #> 5: 29.15983 [s] 1 #> 6: 58.31967 [s] 1
The following figure maps the first 100 data points of the sample
data we processed. They can be converted to
points or linestring.
<- poa_gps[1:100, ] poa_gps60 # points <- gps_as_sfpoints(poa_gps60) poa_gps60_sfpoints # linestring <- gps_as_sflinestring(poa_gps60) poa_gps60_sflinestring # plot plot(sf::st_geometry(poa_gps60_sfpoints), pch = 20) plot(sf::st_geometry(poa_gps60_sflinestring), col = "blue", add = TRUE) box()
gtfs2gps() automatically recognizes whether
the GTFS data brings detailed
stop_times.txt information or
whether it is a
frequency.txt GTFS file. A sample data of a
GTFS with detailed
stop_times.txt cab be found below:
<- system.file("extdata/poa.zip", package ="gtfs2gps") poa <- gtfs2gps(poa, spatial_resolution = 50) poa_gps <- gps_as_sfpoints(poa_gps) poa_gps_sflinestrig plot(sf::st_geometry(poa_gps_sflinestrig[1:200,])) box()
For a given trip, the function
gtfs2gps calculates the
average speed between each pair of consecutive stops given by the ratio
between cumulative network distance
S and departure time
t for a consecutive pair of valid stop_ids
Since the beginning of each trip usually starts before the first
stop_id, the mean speed cannot be calculated as shown in the previous
equation because information on
i period does not exist. In
this case, the function consider the mean speed for the whole trip. It
also happens after the last valid stop_id (
N) of the trips,
where info on
i + 1 also does not exist.
If you have any suggestions or want to report an error, please visit the GitHub page of the package here.