I’ve recently been playing around with a measure of betweenness as a way of generating cartographic street hierarchies specific to a given transport mode.
I have a few broad goals here:
Betweenness is essentially a measure of how often an edge or node in a network belongs to a shortest path across that network, when some large (or exhaustive) number of shortest paths is calculated. In the context of geographic street networks, it’s well established that different modes (car, bike, etc.) generally have different criteria for determining an optimal path. These different routing criteria should produce different shortest paths and, in the aggregate, different betweenness measures, as indeed they do.
What follows are three maps of the same area in Cincinnati, showing betweenness measures which I generated for cars, cyclists, and pedestrians. I’ll explain more about how they were created in a moment, but for now, just know that line thickness in these maps is scaled according to the square root of (one plus my betweenness measure).
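As a small illustration of that scaling, here is a minimal sketch. The function name and the `scale` parameter are hypothetical, not part of the original workflow; only the square-root-of-one-plus-betweenness formula comes from the text.

```python
import math

def line_width(betweenness, scale=1.0):
    """Line width proportional to sqrt(1 + betweenness).

    `scale` is a hypothetical styling constant (e.g. pixels per unit).
    """
    return scale * math.sqrt(1 + betweenness)
```

Adding one before taking the root means a segment that no route ever used still gets a width of `scale` rather than vanishing, while the square root keeps the busiest segments from swamping everything else.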
For cars:
For bicycles:
For pedestrians:
One thing you may notice right off the bat is that some paths are off limits to one or more of the modes. The next is that the car map is probably the most ‘normal’ looking of these. You can kind of see that by looking at the standard OpenStreetMap base map for the same area.
It appears to me that essentially the same streets stand out here as in the car map above. In fact, I can offer a somewhat more precise comparison of these two maps if you’re not quite convinced. OpenStreetMap defines highway tags in a clear hierarchical order, and it’s possible to correlate such ordinal values (ranked as ‘motorway’=10, ‘trunk’=9, ‘primary’=8 and so on down to ‘service’=1) directly with the betweenness measures calculated for the maps shown above. When considering all of the Cincinnati metro area within 30km of downtown, id est:
for which I happen to have calculated my betweenness measures, I get, as a rough back-of-the-envelope calculation, the following (edge-length-weighted) Spearman rank-order correlation coefficients:
If I exclude the obvious car-only, limited-access motorway and trunk roads from the calculation, I still get:
If I were to include bike/ped-only paths, I would probably only push these numbers lower for bikes and peds. It seems clear that the standard OSM hierarchy is car-oriented. Is this kind of a “duh” statement? Perhaps, but it’s nice to have some evidence.
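For anyone wanting to try a similar comparison, here is a minimal sketch of one way to compute an edge-length-weighted Spearman coefficient. It assumes one particular interpretation that the text does not spell out: ordinary average ranks for each variable, followed by a weighted Pearson correlation of those ranks with edge length as the weight. All function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with observation weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)

def weighted_spearman(highway_rank, betweenness, edge_length):
    """Assumed approach: weighted Pearson correlation of the average ranks."""
    return weighted_pearson(
        average_ranks(highway_rank), average_ranks(betweenness), edge_length
    )
```

Here each edge would contribute its OSM highway rank (10 for ‘motorway’ down to 1 for ‘service’), its betweenness count for one mode, and its length as the weight. Other definitions of a weighted rank correlation exist and would give somewhat different numbers.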
Now! How did I calculate these betweenness measures?
The first problem is to decide which points to route between. The standard graph-theory approach is a full every-node-to-every-node combinatorial explosion. In abstract graph theory, this has the result of producing higher measures for nodes which are more central, which is useful for a non-spatial graph like a social network (in which Kevin Bacon surely has betweenness=∞). However, since I’m looking at a metro area defined by some arbitrary boundary of my choosing, the most central areas would be largely determined by that arbitrary boundary. To get around this problem, I placed a distance constraint on the paths I would generate, based on quasi-realistic travel scenarios. People don’t walk as far as they bike, and they don’t bike as far as they drive. Let’s say I limit foot routes to between 0.5 and 1.5 km, bike trips to between 1 and 6 km, and driving trips to between 1.5 and 20 km; this excludes most paths from the combinatorial space and negates edge effects across most of the area of interest. If you have a 10 km max trip length, then you just need to include a 10 km buffer around your area of interest.
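The distance constraint can be sketched as a simple filter on candidate origin/destination pairs. The band values below come from the paragraph above; measuring the distance as a straight (great-circle) line is an assumption, as are all of the names.

```python
import math

# Trip-length bands in km per mode, taken from the text above.
TRIP_RANGE_KM = {"foot": (0.5, 1.5), "bike": (1.0, 6.0), "car": (1.5, 20.0)}

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def pair_allowed(mode, origin, destination):
    """True if the straight-line distance falls inside the mode's band.

    origin and destination are (lon, lat) tuples. Using straight-line
    rather than network distance here is an assumption of this sketch.
    """
    low, high = TRIP_RANGE_KM[mode]
    distance = haversine_km(origin[0], origin[1], destination[0], destination[1])
    return low <= distance <= high

def buffer_km(mode):
    """Buffer needed around the area of interest: the maximum trip length."""
    return TRIP_RANGE_KM[mode][1]
```

Two points about 1.1 km apart would pass for the foot and bike bands but be rejected for cars, since that is below the 1.5 km minimum for driving trips.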
There is still, however, a problem of weighting. Do I route between intersections (as the standard graph theory approach would suggest)? This would underweight the long segments which dominate more rural areas, and generally wouldn’t make sense in this context. Do I weight according to population density, since humans are the only ones who actually make any trips? I tried this and quickly realized that I had reproduced a population density map. Next I tried something less applicable to the real world and perhaps more applicable to the cartographic problem at hand: weighting by edge length. In this approach, origins and destinations have an equal probability of appearing anywhere on the street network, as if you stretched all the streets out end to end and randomly picked a point somewhere between the extremes.
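The "stretch all the streets out end to end" idea maps fairly directly onto a cumulative-length lookup. The following is a minimal sketch with hypothetical names; it returns each point as an edge index plus a distance along that edge, and leaves turning that into coordinates to whatever geometry library is at hand.

```python
import bisect
import itertools
import random

def sample_points_on_network(edge_lengths, n, rng):
    """Pick n points uniformly by length along a list of edges.

    Returns (edge_index, offset_along_edge) tuples. `rng` is a
    random.Random instance so that results are reproducible.
    """
    # Running total of lengths: the streets laid end to end.
    cumulative = list(itertools.accumulate(edge_lengths))
    total = cumulative[-1]
    points = []
    for _ in range(n):
        position = rng.random() * total  # a spot along the stretched-out line
        idx = bisect.bisect_right(cumulative, position)
        idx = min(idx, len(edge_lengths) - 1)  # guard against float rounding
        start = cumulative[idx - 1] if idx > 0 else 0.0
        points.append((idx, position - start))
    return points
```

With this scheme an edge three times as long is three times as likely to host an origin or destination, which is what keeps long rural segments from being underweighted.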
Random points were selected in this way, essentially one pair at a time. Each pair was checked against the distance criteria for the mode, and the route was found using OSRM and the default routing profiles for cars, bikes, and pedestrians respectively. If the route produced was way too long (e.g. points on opposite sides of the river with no nearby bridge), then it was ignored. This may not produce a realistic trip distance distribution, but it did seem to produce pretty pleasing results for now. Some improvement will be needed in this area. For each mode, somewhere around a million routes were calculated for one metro area and its surrounding buffer.
Anyhoo, OSRM is able to return a vector of the OSM node IDs along the length of the route. By keeping counts for all the segments, it’s easy to reconstruct a geometry table with counts for each mode. That’s how the above maps were generated. I may be posting code eventually if there is any interest in this.
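The counting step might look something like the following. This is an illustrative sketch rather than the code used for the maps above: it assumes each route has already been reduced to a list of OSM node IDs, and it treats segments as undirected, which is a choice of this example.

```python
from collections import Counter

def count_segments(routes):
    """Count how many routes traverse each segment.

    `routes` is an iterable of node-ID lists, one per route. A segment
    is a pair of consecutive node IDs, stored with the smaller ID first
    so that both travel directions count toward the same segment
    (an assumption of this sketch).
    """
    counts = Counter()
    for nodes in routes:
        for a, b in zip(nodes, nodes[1:]):
            if a != b:  # skip repeated nodes
                counts[(min(a, b), max(a, b))] += 1
    return counts
```

Running one such counter per mode and joining the results back to segment geometries by node-pair key would give the kind of per-mode count table described above.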
I hope to be using this technique more in the near future, perhaps to make a start on a Toronto bike map or something of that nature. My thoughts are a bit of a jumble at the moment, having been working on this project through a couple months of fits and starts with an eye toward a poster for the now completed NACIS 2017. Hopefully soon I’ll have something more coherent to say about where this effort may be going.