Adventures in reverse geocoding

Had some longitude & latitudes although in the biz they use them long, lat https://tools.ietf.org/html/rfc7946#section-3.1.1 (go figure)

The long, lat is in a redshift DB, so first order of business is getting the data out of redshift.

Luckily there is the handy unload command

Remember to escape your quotes, doubling the single ticks works

https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html
unload(
    'select * from venue where venuestate=''NV'''
) to 's3://yourbucket/test' 
credentials 'aws_iam_role=arn:aws:iam::yourrole' 
gzip
delimiter '|' 
parallel off;

Ok, now you have some data

Now we need to do some reverse geocoding.

One pattern to do this sort of transformation is AWS step functions that call Glue Python Shell Jobs.

Once you have the data in S3, write a python script to reverse geocode each entry. There are a number of online services that offer this.

Here.com has a very generous 250K Freemium account! https://developer.here.com/pricing

LocationIQ has a free 10K a day https://locationiq.com/pricing

http://geocode.xyz/ has a very interesting Reverse Geocode Lite https://aws.amazon.com/marketplace/pp/B07LB39JBB

Common to all of these services is being able to pass the API the lat, long and it brings back the country, city, state etc…

For example from geocode.xyz:

import requests

resp = requests.get("https://api.3geonames.org/51.4647,0.0079.json")
json = resp.json()
print(json['threegeonames'])
print(json['nearest']['city'])
print(json['nearest']['timezone'])
print(json['osmtags']['name'])
print(json['major']['city'])
print(json['major']['prov'])

Here.com has quite extensive documentation with examples.

https://developer.here.com/documentation/geocoder/dev_guide/topics/resource-reverse-geocode.html

With a response

So Parse the response, pulling out what data you need, and then load it back into Redshift. The step function looks like this.

Advertisement

One comment

  1. Hi Michael,

    Would like to suggest BigDataClous’s reverse geocoding API as well. Unlike other APIs, BigDataCloud has always free version and freemium version (50K queries per month).

    One of the unique aspects of the API is that it is the first to deliver administrative/non-administrative boundary-based results.

    We have written in more detail the comparison between various reverse Geocoding APIs, which might be insightful.

    https://www.bigdatacloud.com/blog/five-reasons-why-bigdatacloud-api-is-better-than-google-maps-free-reverse-geocoding-api

    Thanks.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s