Downloading Who's On First

State

At rest every Who's On First record is an atomic GeoJSON-encoded text file. This decision addresses concerns about the portability and longevity of the project but can make it a challenge to get started. For that reason we provide a variety of pre-packaged distributions, in a number of formats, of Who's On First data organized by placetype and country.

Distributions

Distributions produced by the Who's On First project have been put on hold for the time being. Data distributions are currently sponsored by Geocode Earth:

https://geocode.earth/data/whosonfirst

Git(Hub)

All the Who's On First data and its complete edit history are available on GitHub:

How the Who's On First data repositories are organized

We use the Git version control system to manage Who's On First data. One of the present-day limits of Git is the number of atomic files you can store in a single Git "repository". We believe that eventually it will be possible to keep all 26 million records in a single Git repository but we are not there yet.

Instead of a single monolithic repository we have grouped Who's On First data as follows:

  • Administrative Data - all administrative placetypes (every placetype from continents down to microhoods, inclusive) for the entire world
  • Everything else - all other placetypes including venues, postalcodes, constituencies and intersections

The naming convention for Who's On First data repositories, at its most granular, looks like this:

	whosonfirst-data + "-" + WHOSONFIRST_PLACETYPE + "-" + WHOSONFIRST_COUNTRY + "-" + WHOSONFIRST_SUBDIVISION

There are some important points to keep in mind about these conventions:

  • All administrative data is stored in the whosonfirst-data repository. Administrative data is not subdivided by country or placetype.
  • We try to use the shortest name whenever possible. The only reason for subdividing a placetype by country (or a country by subdivision) is to address operational concerns of working with the data in Git or GitHub.
  • As a general rule every Who's On First document should have an explicit wof:repo property containing the name of its parent repository.
  • Who's On First documents should also contain all the properties necessary to reconstruct their wof:repo name, allowing developers to validate that name by testing for the presence of a matching repository, starting with the most granular name and working backwards.
  • The Who's On First data model allows for all of these repositories to be merged in to a single tree structure without any collisions. If a Who's On First record is accidentally stored in or saved to an inappropriate repository, that is considered an inconvenience (to be fixed) but not an error.
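The "most granular name first, then work backwards" lookup described above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not project code: the function name candidate_repos is hypothetical, and lower-casing the country and subdivision codes is an assumption based on the "-us-" form used in the venue repository names.

```python
def candidate_repos(placetype, country=None, subdivision=None):
    """Return possible repository names, most granular first.

    Hypothetical helper: the name and the lower-casing of the
    country and subdivision codes are assumptions for illustration.
    """
    parts = ["whosonfirst-data", placetype]
    if country:
        parts.append(country.lower())
        # A subdivision only makes sense underneath a country.
        if subdivision:
            parts.append(subdivision.lower())
    # Drop one trailing component at a time, ending at the base name.
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]
```

A caller would test each name in turn for a matching repository and stop at the first one that exists. For administrative placetypes that search should end at the plain whosonfirst-data name.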

Administrative Data

Administrative data is located in the whosonfirst-data repository, which contains all administrative placetypes (every placetype from continents down to microhoods, inclusive) for the entire world.

You can find data for the following placetypes in the whosonfirst-data repository: continent, empire, country, macroregion, region, macrocounty, county, locality, macrohood, neighbourhood, microhood
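As a rough illustration, the placetype list above can be used to check that an administrative record names whosonfirst-data as its repository. The sketch below assumes the record stores its placetype in a wof:placetype property alongside wof:repo. The function name check_admin_repo is hypothetical.

```python
import json

# Administrative placetypes, as listed above.
ADMIN_PLACETYPES = {
    "continent", "empire", "country", "macroregion", "region",
    "macrocounty", "county", "locality", "macrohood",
    "neighbourhood", "microhood",
}

def check_admin_repo(geojson_text):
    """Check the wof:repo of a single GeoJSON record.

    Returns None when the record is not administrative, otherwise
    True or False depending on whether wof:repo is "whosonfirst-data".
    Assumes the placetype is stored in a wof:placetype property.
    """
    props = json.loads(geojson_text).get("properties", {})
    if props.get("wof:placetype") not in ADMIN_PLACETYPES:
        return None
    return props.get("wof:repo") == "whosonfirst-data"
```

Since a misplaced record is an inconvenience rather than an error, a False result here is best treated as something to report and fix, not a reason to reject the record.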

Other Data

Venues

There are over 20 million venues in Who's On First, with about 60% in the USA. Venues in the USA are grouped in to whosonfirst-data-venue-us-{WHOSONFIRST_SUBDIVISION} repositories, while everything else is grouped in to whosonfirst-data-venue-{WHOSONFIRST_COUNTRY} repositories.
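The venue grouping rule can be written down as a small function. This is a sketch under stated assumptions: venue_repo is a hypothetical name, the country and subdivision codes are assumed to be lower-cased in repository names, and a US venue with no known subdivision is assumed to fall back to the country-level name.

```python
def venue_repo(country, subdivision=None):
    """Return the expected repository name for a venue.

    Hypothetical helper: lower-casing the codes and the fallback for
    a US venue without a subdivision are assumptions.
    """
    country = country.lower()
    # Only US venues are split further by subdivision.
    if country == "us" and subdivision:
        return f"whosonfirst-data-venue-us-{subdivision.lower()}"
    return f"whosonfirst-data-venue-{country}"
```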

There is also a general purpose whosonfirst-data-venue repository which contains no data but pointers to all the venue-related repositories that do:

Postal Codes

Postal codes are grouped in to whosonfirst-data-postalcode-{WHOSONFIRST_COUNTRY} repositories.

There is also a general purpose whosonfirst-data-postalcode repository which contains no data but pointers to all the postal code-related repositories that do:

Constituencies

Constituencies are available for only a select number of countries (two to be exact: the USA and Canada) as we work through what it means to include constituencies in Who's On First. If you have constituencies from other countries we'd love to include them too.

Intersections

Intersections are a still-experimental placetype in Who's On First, currently only available for New York City (and specifically Manhattan).