The San Francisco Shortest Path Dataset
2018, Martin Werner

San Francisco Shortest Paths


The Shortest Path in San Francisco dataset is a synthetic dataset of shortest paths around San Francisco. The paths were computed from random start and end points, using weight maps derived from distance and from travel time, together with small random polygonal obstructions that force individual paths to avoid certain small regions. It contains 20,242 trajectories with about 5 million points in total.

You can download it here

It was created for the ACM SIGSPATIAL GIS Cup 2017 on Range Queries under Fréchet distance and is provided as a set of trajectories in Global Web Mercator (EPSG:3857). It is based on the submission that won the ACM SIGSPATIAL GIS Cup 2015 (Werner, 2015), with which shortest paths under polygonal constraints can be extracted.
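
To illustrate what a range query under Fréchet distance asks for, the following is a minimal sketch, not the method used in the cup. It assumes the discrete Fréchet distance (the standard dynamic program over point pairs), whereas the competition's exact definition may differ, and it scans all trajectories linearly. The function names `discrete_frechet` and `range_query` and the toy coordinates are made up for this example. The threshold `eps` is measured in the projected units of the input coordinates.

```python
import numpy as np


def discrete_frechet(p, q):
    """Discrete Fréchet distance between point arrays of shape (n, d) and (m, d)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    n, m = len(p), len(q)
    # pairwise Euclidean distances between all points of p and q
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)
    # ca[i, j]: coupling distance of the prefixes p[:i+1] and q[:j+1]
    ca = np.empty((n, m))
    ca[0, 0] = d[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            best_prev = min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1])
            ca[i, j] = max(best_prev, d[i, j])
    return float(ca[-1, -1])


def range_query(trajectories, query, eps):
    """Indices of all trajectories within distance eps of the query (linear scan)."""
    return [i for i, t in enumerate(trajectories)
            if discrete_frechet(t, query) <= eps]


if __name__ == "__main__":
    # toy trajectories with hypothetical coordinates
    query = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
    near = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
    far = np.array([[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]])
    print(range_query([near, far], query, eps=1.5))  # prints [0]
```

A linear scan like this costs one quadratic-time distance computation per trajectory, so for the full dataset it mainly serves as a reference against which faster approaches can be checked.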

The data is derived from OpenStreetMap, and we republish the derived data under identical terms, that is, under the Open Data Commons Open Database License (ODbL). For details, see

If you use it in your scientific work, we encourage you to

  • preferably link to the DOI or alternatively
  • link to this web page to enable readers to easily access the data, and to
  • cite this dataset by citing the article of the SIGSPATIAL cup for which it has been created (Werner & Oliver, 2018)


Example Usage

To get you started, the following Python snippet creates a list of all trajectories, each formatted as an individual NumPy array. To do so, it parses the TGZ file directly. This is rather slow, so it should only be used once, when importing the data into your preferred working format.

import tarfile
import urllib.request
from os.path import isfile

import numpy as np
from tqdm import tqdm
from matplotlib import pyplot as plt

if __name__ == "__main__":
    print("Checking if data exists")
    if not isfile('shortest-sf.tgz'):
        print("Downloading... ")
        urllib.request.urlretrieve("", "shortest-sf.tgz")
    else:
        print("Found local file")

    # read every trajectory file directly from the archive
    dataset = tarfile.open('shortest-sf.tgz')
    loa = list()
    for f in tqdm(dataset.getmembers()):
        if f.isfile():
            f_trajectory = dataset.extractfile(f)
            # skip the header line; each file becomes one array
            m = np.loadtxt(f_trajectory, skiprows=1)
            loa.append(m)  # add to list of arrays
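
Since parsing the archive is slow, one option is to store the parsed list once in a binary file and reload it from there afterwards. The following is a minimal sketch of that idea. It assumes that every trajectory array has the same number of columns. The helper names `save_trajectories` and `load_trajectories` and the file name are hypothetical and not part of the dataset.

```python
import os
import tempfile

import numpy as np


def save_trajectories(trajectories, path):
    """Store a list of (n_i, d) arrays as one flat point array plus lengths."""
    lengths = np.array([len(t) for t in trajectories])
    points = np.concatenate(trajectories, axis=0)
    np.savez_compressed(path, points=points, lengths=lengths)


def load_trajectories(path):
    """Restore the list of per-trajectory arrays written by save_trajectories."""
    with np.load(path) as data:
        points, lengths = data["points"], data["lengths"]
    # split the flat point array back at the trajectory boundaries
    return np.split(points, np.cumsum(lengths)[:-1])


if __name__ == "__main__":
    # stand-in for the list of arrays produced by parsing the archive
    loa = [np.zeros((3, 2)), np.ones((5, 2))]
    cache = os.path.join(tempfile.mkdtemp(), "shortest-sf.npz")
    save_trajectories(loa, cache)
    restored = load_trajectories(cache)
    print(len(restored), restored[1].shape)  # prints 2 (5, 2)
```

Keeping all points in one array with a separate length vector avoids storing thousands of small arrays and makes the reload a single read followed by a split.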


  1. Werner, M. (2015). GISCUP 2015: Notes on Routing with Polygonal Constraints. SIGSPATIAL GIS CUP 15, in conjunction with the 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2015).
  2. Werner, M., & Oliver, D. (2018). ACM SIGSPATIAL GIS Cup 2017 - Range Queries Under Fréchet Distance. ACM SIGSPATIAL Newsletter, to appear.

© 2020 M. Werner