Handling GEO Spatial data in Postgres with H3

Sudheer Gajula
6 min read2 days ago

--

Geospatial data has important role in location based service, which is used to match supply and demand. For example — requesting ride from cab hailing services or delivery of food , asset tracking etc. In this article, we shall focus on how spatial data can be used in postgres along with H3.

Spatial data is often represented using following data types :

Point — Point represents single location, where X refers to Latitude and Y represents to Longitude. Example of point is school, home, office etc.

LineString — LineString represents a 1Dimensional sequence of two points connecting. For example distance between PointA and PointB. LineString can be categorised into subsections like circular, compounded etc.

Polygon — Polygon is 2Dimensional surface, represeting sequence of points that defines a zone. For example — Area/District/State/Country can be represented as polygon.

PostGis in Action

PostGis and extension of postgres, supports handling spatial data. It extends all the postgres capability like ACID and Indexing. And also has bunch of functions to manipulate geometry data.

  1. Setup

Lets have postgres running through docker.

docker run --name postgres -e POSTGRES_PASSWORD=mypass -p 5433:5432 -d postgres
docker exec -it postgres psql -U postgres
CREATE EXTENSION postgis;
CREATE EXTENSION h3;
CREATE EXTENSION h3_postgis CASCADE;

2. Download and extract geo spatial data.

OpenStreetMap is a free, open map database updated and maintained by a community volunteers. You can navigate to OSM downloads page (https://download.openstreetmap.fr/extracts/) and download the desired region PBF(proto binary format) extract.

3. Loading data

Install osm2pgsql extension in your container and extract the pbf data and load them into database. These files are usually in few MBs to GBs.

docker cp karnataka.osm.pbf postgres:/
docker exec it postgres bash
apt install osm2pgsql
osm2pgsql -c -d temp -x -U postgres -E 4326 karnataka.osm.pbf

Once data is loaded into database, this will create 3 important tables having data about points, lines and polygons: planet_osm_line, planet_osm_point, planet_osm_polygon.

temp=# \d
List of relations
Schema | Name | Type | Owner
--------+--------------------+-------+----------
public | geography_columns | view | postgres
public | geometry_columns | view | postgres
public | planet_osm_line | table | postgres
public | planet_osm_point | table | postgres
public | planet_osm_polygon | table | postgres
public | planet_osm_roads | table | postgres
public | spatial_ref_sys | table | postgres

Basic Functions of PostGIS:

  • ST_X(point) returns the X co-ordinate, Lat
  • ST_Y(point) returns the Y ordinate, Long
  • ST_Length(linestring) length of the linestring between 2 points
  • ST_StartPoint(geometry) returns the first coordinate as a point
  • ST_EndPoint(geometry) returns the last coordinate as a point
  • ST_AsGeoJSON(geometry) returns JSON text
  • ST_DWithin(geometry A, geometry B, radius): Returns true if the geometries are within the specified distance (radius) of one another.

Few example queries:

-- Get X and Y co-ordinates
select ST_X(ST_Transform(way, 4326)), ST_Y(ST_Transform(way, 4326)), h3_index from planet_osm_point where name like '%ITPL%';

-- Get list of area name within Ploygon
select name from planet_osm_polygon where ST_Within('0101000020E6100000C118366FC16B53406C89A6FD20002A40', way);

-- Get list of area name intersecting with co-ordinates
select name from planet_osm_polygon where ST_Intersects('0101000020E6100000C118366FC16B53406C89A6FD20002A40', way);select name from planet_osm_polygon where ST_Within('0101000020E6100000C118366FC16B53406C89A6FD20002A40', way);

H3 — What are hexagons in GIS?

Hexagons are similar to traditional square grids used in GIS, hexagons provide uniform and efficient way to represent spatial data. Unlike squares or circles, they have a more uniform distance from centre to edge, making them ideal for to map boundaries with relative consistency across all directions.

This was introduced by Uber, H3 hexagon grid can be generated with hexagons on a defined size scale of 0 (largest) to 15 (smallest) depending on resolution. As we scale up, the number of hexagons used to fit in will be higher. For example — as we zoom into geography/map, the resolution changes to render it proeprly. You can try it here https://h3geo.org/#

Below is the table that we can refer to see, usually Level7 works best covering 5KMs radius which has 98Mil hexagons to represent entire planet.

Lets add H3 index to represent each point by resolution 7. This converts every point to an H3 Index.

h3_lat_lng_to_cell(way::POINT, 7)

ALTER TABLE planet_osm_trees ADD COLUMN h3_index h3index G
ENERATED ALWAYS AS (h3_lat_lng_to_cell(way::POINT, 7)) STORED;

Now, lets say when we want to book a cab and we would like to identify desired area/polygon which cover certain distance. From above it covers 5KMs radius for Level7 resolution.

Proximity Search

Proximity search allows users to retrieve results based on the spatial or relational closeness of data points. It is used to find locations or objects within a specific physical distance from a reference point.

  • Cab hailing — Identify the area of user while booking a cab to match user to driver.
  • Food delivery — List of restaurants that can deliver food to location.
  • Visualisation — Identify densely populated areas/ forests etc.
  • Shipment tracking — Tracking location of delivery etc

Let’s use current coordinates to identify the region and based on which we can run proximity search to match closest points. We shall use H3 lib to calculate H3 index at resolution 7 for current coordinates.

H3Core h3 = H3Core.newInstance();
double lat = 12.9913888;
double lng = 77.6823877;
int res = 7;
String hexAddr = h3.latLngToCellAddress(lat, lng, res);
//8761892e0ffffff

Attributes to watch: cellArea, resolution, and h3Index.

In order to identify list of places from the centroid of a location we can query postgis with function ST_DWithin(geo, geo, radius):

SELECT osm_id, name FROm planet_osm_polygon  
WHERE
ST_DWithin('0101000020E6100000E5FF9FD4286F53400BEB6B02B1F92940', way, 3000);

To check if specific location lies within a polygon:

SELECT name FROM planet_osm_polygon  
WHERE ST_Within('0101000020E6100000E5FF9FD4286F53400BEB6B02B1F92940', way);
name
-----------------------------------------------------
Bengaluru
Mahadevapura
Mahadevapura Zone
Bengaluru Urban
Bangalore East
Kaveri River Basin
Hudi
Bangalore Metropolitan Region Development Authority
South Western Railway
Karnataka

Get all the k-distance neighbours, identify all the nearest areas or neighbouring cells which helps in expanding search areas.

H3Core h3 = H3Core.newInstance();
double lat = 12.9913888;
double lng = 77.6823877;
int res = 7;
String hexAddr = h3.latLngToCellAddress(lat, lng, res);
System.out.println(hexAddr);
List<String> strings = h3.gridDisk(hexAddr, 1);

Output:
[8761892e0ffffff, 8761892e6ffffff, 8761892e2ffffff, 8761892e3ffffff, 8761892e1ffffff, 8761892e5ffffff, 8761892e4ffffff]

Conclusion:

PostGis being entirely free, it can be used easily installed and integrated to start analysing spatial data. There are also other databases like MongoDB, MySQL, SQL Server etc, that supports handing spatial data.

Also, PostGis indexes spatial data using R-Tree Index, which indexes data in optimised way for these data types and also many of these functions like ST_DWithin, ST_Intersects, ST_Touches and many more uses indexes efficiently to process queries.

ST_DWithin is very important for doing “within a distance” or “within a radius” style queries while still getting a performance boost from the indexs.

References :

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

--

--

Sudheer Gajula
Sudheer Gajula

Written by Sudheer Gajula

Software Engineer. Passionate about distributed system, big data. LinkedIn: http://linkedin.com/in/sudgajula

No responses yet

Write a response