geonames_postcode usage
High-level interface
geonames_postcode.valid_postcode(country, postcode)
Check validity of (country, postcode) combination.
>>> valid_postcode('DE', '85716')
True
geonames_postcode.valid_name(country, name)
Check validity of (country, name) combination.
>>> valid_name('DE', 'Unterschleißheim')
True
geonames_postcode.valid(country, postcode_or_name)
Check validity of (country, postcode_or_name) combination where postcode_or_name can be a postcode or a name.
>>> valid('DE', '85716')
True
>>> valid('DE', 'Unterschleißheim')
True
geonames_postcode.coordinates_postcode(country, postcode)
Get coordinates (latitude, longitude) of (country, postcode). Returns (None, None) when the postcode is invalid for country.
>>> coordinates_postcode('DE', '85716')
(48.2804, 11.5768)
geonames_postcode.coordinates_name(country, name)
Get coordinates (latitude, longitude) of (country, name). Returns (None, None) when the name is invalid for country.
>>> coordinates_name('DE', 'Unterschleißheim')
(48.2804, 11.5768)
geonames_postcode.coordinates(country, postcode_or_name)
Get coordinates (latitude, longitude) of (country, postcode_or_name). Returns (None, None) when postcode_or_name is neither a valid postcode nor name for country.
>>> coordinates('DE', '85716')
(48.2804, 11.5768)
>>> coordinates('DE', 'Unterschleißheim')
(48.2804, 11.5768)
geonames_postcode.regions(country)
Get regions of country.
The returned list is sorted alphabetically.
>>> regions('DE')
['Baden-Württemberg', 'Bayern', 'Berlin', 'Brandenburg', 'Bremen', 'Hamburg', 'Hessen', 'Mecklenburg-Vorpommern', 'Niedersachsen', 'Nordrhein-Westfalen', 'Rheinland-Pfalz', 'Saarland', 'Sachsen', 'Sachsen-Anhalt', 'Schleswig-Holstein', 'Thüringen']
geonames_postcode.distance(latitude1, longitude1, latitude2, longitude2)
Calculate the distance in km between the two coordinates (latitude1, longitude1) and (latitude2, longitude2).
>>> distance(*coordinates('DE', 'Unterschleißheim'), *coordinates('DE', 'München'))
15.289746063637923
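The documentation does not say which formula distance() uses. As an illustration only, a great-circle distance between two coordinates can be computed with the haversine formula. The function name haversine_km and the Earth radius of 6371 km are assumptions of this sketch and are not part of geonames_postcode, so its results need not match distance() exactly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(latitude1, longitude1, latitude2, longitude2):
    """Great-circle distance in km between two (latitude, longitude) points."""
    phi1 = math.radians(latitude1)
    phi2 = math.radians(latitude2)
    dphi = phi2 - phi1
    dlambda = math.radians(longitude2 - longitude1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

With this radius, one degree of latitude corresponds to roughly 111 km, which is a convenient sanity check for any such function.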
geonames_postcode.postcode_names(country, postcode)
Get names of (country, postcode).
The returned list is sorted alphabetically.
>>> postcode_names('DE', '85716')
['Unterschleißheim']
geonames_postcode.postcode_regions(country, postcode)
Get regions of (country, postcode).
Most of the time exactly one region is returned, but as postcodes and region boundaries do not match in some cases, several regions might be returned. The returned list is sorted alphabetically.
>>> postcode_regions('DE', '85716')
['Bayern']
geonames_postcode.name_postcodes(country, name)
Get postcodes of (country, name).
>>> name_postcodes('DE', 'Unterschleißheim')
['85716']
geonames_postcode.name_autocomplete(country, name_start, sort='size')
Get names of (country, name_start), i.e. the names in country that start with name_start.
Results are roughly sorted by size (using the number of matching postcodes, largest first). Pass sort='alphabetical' to sort them alphabetically instead.
>>> name_autocomplete('DE', 'Untersch')
['Unterschleißheim', 'Unterschneidheim', 'Unterschönau', 'Unterschwaningen']
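The sorting rule described above can be sketched as follows. This is not the package's implementation: the function autocomplete, the shape of the names mapping (name to list of postcodes), the alphabetical tie-break, and the example data are all made up for illustration.

```python
def autocomplete(names, name_start, sort='size'):
    """Return the names starting with name_start.

    names is a hypothetical mapping of name -> list of postcodes.
    """
    matches = [name for name in names if name.startswith(name_start)]
    if sort == 'alphabetical':
        return sorted(matches)
    # More postcodes is taken as a proxy for a larger place (largest first);
    # ties are broken alphabetically, which is a choice of this sketch.
    return sorted(matches, key=lambda name: (-len(names[name]), name))


example_names = {  # invented places and postcodes, for illustration only
    'Alphastadt': ['10001', '10002', '10003'],
    'Alphadorf': ['20001'],
    'Betaheim': ['30001', '30002'],
}
```

Using the postcode count as the size measure means no population data is needed, at the cost of the ordering being only approximate.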
geonames_postcode.name_substitutes(country, name, *substitutes)
Add name substitutes for name in country.
>>> valid('DE', 'Frankfurt')
False
>>> name_substitutes('DE', 'Frankfurt am Main', 'Frankfurt')
>>> valid('DE', 'Frankfurt')
True
geonames_postcode.nearby_postcodes(country, latitude, longitude, dist)
Get postcodes closer than dist (in km) from the given coordinate.
This is the preferred way to do a radius search in a database: filter on the returned postcodes with the SQL IN operator.
>>> nearby_postcodes('DE', *coordinates('DE', 'Unterschleißheim'), 5)
['85386', '85716', '85764', '85778']
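A minimal sketch of such a radius search, using the standard library sqlite3 module. The customer table and its rows are hypothetical, and the postcode list is copied from the example above rather than computed, so that the snippet runs without geonames_postcode installed.

```python
import sqlite3

# Postcodes as returned by the nearby_postcodes example above.
postcodes = ['85386', '85716', '85764', '85778']

connection = sqlite3.connect(':memory:')
# Hypothetical table holding one postcode per row.
connection.execute('CREATE TABLE customer (name TEXT, postcode TEXT)')
connection.executemany(
    'INSERT INTO customer VALUES (?, ?)',
    [('A', '85716'), ('B', '80331'), ('C', '85764')],
)

# One "?" placeholder per postcode keeps the query parameterized.
placeholders = ', '.join('?' for _ in postcodes)
query = (f'SELECT name FROM customer '
         f'WHERE postcode IN ({placeholders}) ORDER BY name')
nearby_customers = [row[0] for row in connection.execute(query, postcodes)]
```

The distance computation happens once in Python, and the database only has to match against a short list of strings, which an index on the postcode column can serve.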
Internals
The high-level interface essentially just reads data from the following data structures:
geonames_postcode._regions = {}
_regions[country] is a sorted list of regions. It is prefixed by an underscore to prevent a name clash with the regions() function. The latter just returns the regions of a country as contained in _regions, but includes the boilerplate code to load() the country if not yet loaded.
geonames_postcode.postcodes = {}
postcodes[country] is a mapping of postcodes to postcode_item.
class geonames_postcode.postcode_item(names, regions, latitude, longitude)
Postcodes are mapped to postcode items.
class geonames_postcode.name_item(postcodes, latitude, longitude)
Names are mapped to name items.
This data is available at the module level after calling load(). It is identical to the data created from the geonames.org data (see Preparing geonames.org data), except for the additional country mapping.
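To make the shape of these structures concrete, here is a small stand-alone sketch. The field names follow the class signatures above, but modelling the items with collections.namedtuple, the layout of the names mapping, and the helper lookup_coordinates are assumptions of this sketch, not confirmed details of the package.

```python
from collections import namedtuple

# Field names follow the documented class signatures; namedtuple is assumed.
postcode_item = namedtuple('postcode_item', 'names regions latitude longitude')
name_item = namedtuple('name_item', 'postcodes latitude longitude')

# Hypothetical loaded data for one country, using values from the examples.
postcodes = {
    'DE': {'85716': postcode_item(['Unterschleißheim'], ['Bayern'],
                                  48.2804, 11.5768)},
}
names = {
    'DE': {'Unterschleißheim': name_item(['85716'], 48.2804, 11.5768)},
}


def lookup_coordinates(country, postcode_or_name):
    """Return (latitude, longitude), or (None, None) if nothing matches."""
    item = (postcodes.get(country, {}).get(postcode_or_name)
            or names.get(country, {}).get(postcode_or_name))
    if item is None:
        return (None, None)
    return (item.latitude, item.longitude)
```

Keeping the country as the outer key means each country's data can be loaded independently and on demand.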
Preparing geonames.org data
Please respect the license of the geonames.org data (Creative Commons Attribution 4.0 License), see geonames.org for details.
Fetching the geonames.org databases and preparing them for use is where the time-consuming preparation of the postcode data takes place. This process results in a Python module for each country containing the data for the regions, postcodes, and names. The fetch function has a rather simple interface.
The fetch() function can conveniently be started with the geonames_postcode_fetch script created by setup.py and pip:
geonames_postcode_fetch DE
Note that the postcode data can also be fetched in-place (without script installation) by:
python -m geonames_postcode.fetch DE
The postcode data is downloaded as a zip file, cached (so that it is not fetched from geonames.org multiple times), analyzed, and translated into a Python file named <country>.py in the data directory of the geonames_postcode package. It can then be loaded and used quickly by geonames_postcode, since no further processing is needed at load time.
Fetching and preparing the geonames_postcode data is subject to some configuration done in the fetch.ini file:
[DEFAULT]
skip_for_names=[]
postcode_remove_from=
max_distance=25
add_distance_per_item=1
name_postcode_chars=0
[CA]
skip_for_names=[{"name": "Bathurst", "latitude": 46.159900, "longitude": -65.810200}]
[CZ]
name_postcode_chars=6
[DK]
max_distance=50
[ES]
skip_for_names=[{"postcode": "15190", "name": "A Coruña"}]
[FI]
max_distance=30
[FR]
skip_for_names=[{"name": "Montaigut", "latitude": 45.612700, "longitude": 3.447800}]
[GB]
name_postcode_chars=4
[IT]
max_distance=30
[PT]
postcode_remove_from=-
[US]
skip_for_names=[{"name_start": "APO "}, {"name_start": "FPO "}]
max_distance=50
add_distance_per_item=2
[SE]
skip_for_names=[{"postcode": "917 01", "name": "Dorotea"}]
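A file in this format can be read with the standard library configparser module, where per-country sections inherit from [DEFAULT]. The following is a sketch only: how fetch() actually reads fetch.ini is not shown in this documentation, the helper country_options is hypothetical, and only a subset of the sections above is embedded to keep the snippet short.

```python
import configparser
import json

# A subset of the fetch.ini shown above.
FETCH_INI = """
[DEFAULT]
skip_for_names=[]
postcode_remove_from=
max_distance=25
add_distance_per_item=1
name_postcode_chars=0
[PT]
postcode_remove_from=-
[US]
skip_for_names=[{"name_start": "APO "}, {"name_start": "FPO "}]
max_distance=50
add_distance_per_item=2
"""

config = configparser.ConfigParser()
config.read_string(FETCH_INI)


def country_options(country):
    """Return typed options for country, falling back to [DEFAULT]."""
    if config.has_section(country):
        section = config[country]  # missing keys fall back to [DEFAULT]
    else:
        section = config.defaults()
    return {
        'skip_for_names': json.loads(section['skip_for_names']),
        'postcode_remove_from': section['postcode_remove_from'],
        'max_distance': float(section['max_distance']),
        'add_distance_per_item': float(section['add_distance_per_item']),
        'name_postcode_chars': int(section['name_postcode_chars']),
    }
```

For example, country_options('US') picks up the two US-specific distances and the APO/FPO selectors, while a country without its own section, such as DE, gets the defaults.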
The configuration options are:
skip_for_names
- Skip some postcode data when building the names. The data to be skipped is selected by name, name_start (names starting with), latitude, longitude, and/or postcode. Several selectors are given by a list of dictionaries (i.e. the option is parsed as JSON). Within each dictionary the conditions are combined with and. The items of the list are taken as or conditions.
postcode_remove_from
- Throw away details in the postcodes by removing the whole tail starting at the given string.
max_distance
- Combine equal names for different postcodes if they are no further than the given distance (in km) from the center of all values with equal names.
add_distance_per_item
- An additional distance to be added to max_distance for each postcode in the list of equal names.
name_postcode_chars
- It turns out that the names alone are not necessarily enough to properly distinguish between locations. The fetch() function thus tries to add regions, sub-regions, and sub-sub-regions as given in the postcode data. However, this still might not be enough in all cases, and the postcodes might be taken into account up to the given number of characters.
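The semantics of the first four options can be sketched in a few lines. This is an interpretation of the descriptions above, not the code used by fetch(): the helper names, the dictionary layout of an entry, and the sample values in the usage note are all hypothetical.

```python
def is_skipped(entry, selectors):
    """Return True if entry matches at least one selector.

    entry is a hypothetical dict with the keys name, postcode, latitude
    and longitude. Conditions inside one selector are combined with "and";
    the selectors in the list are combined with "or".
    """
    def matches(selector):
        for key, value in selector.items():
            if key == 'name_start':
                if not entry['name'].startswith(value):
                    return False
            elif entry[key] != value:
                return False
        return True

    return any(matches(selector) for selector in selectors)


def strip_postcode(postcode, remove_from):
    """Drop the tail of postcode starting at remove_from (if non-empty)."""
    if not remove_from:
        return postcode
    return postcode.split(remove_from, 1)[0]


def allowed_distance(max_distance, add_distance_per_item, postcode_count):
    """Distance limit in km for postcode_count equally named postcodes."""
    return max_distance + add_distance_per_item * postcode_count
```

With the [US] values, three equally named postcodes would be allowed to lie up to allowed_distance(50, 2, 3), i.e. 56 km, from their common center, and with the [PT] setting a made-up postcode such as '1000-001' would be reduced to '1000'.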