How to Do Shortest Path Analysis With Own Data Using PostGIS
- Published on
Article Number : 12
If you want to perform a shortest path analysis using your own data in situations where you cannot utilize the Routing API feature provided by Google and OSM services, this blog post aims to guide you through the process.
1- Spatial Data Quality and Topological Relation
We store roads, natural gas pipelines, water channels, fiber optic cables, power transmission lines, and similar objects, defined as geometric objects within the scope of Geographic Information Systems, in our database using the LineString geometry type.
The real data collected from the field is simplified and brought into a situation where spatial relations can be established with each other. In GIS, this relationship is called a topological relationship, which refers to the spatial connectivity and adjacency between these objects, and may vary according to different studies.
In network analyses, LineString geometries should not be related to any other same geometry table from anywhere except the first and last point. It is crucial to eliminate errors because such relationships prevent the topological relationship from being healthy. In the picture below, you’ll see a small part of a road line. Although it might be difficult to notice the topological error as the scale gets smaller in this picture, when you get closer to the connection points, it will become very easy to realize that the wrong topological relationship is established.
What it should be is the square in blue. The difference between the two examples is that while the objects in the blue box are connected to each other and snapped at the same point, in the red example, they are not fully aligned with the LineString.
Besides this, there is a second rule: The topological relationship segments that should be included in the network analysis should be inserted correctly. An A line may have other B lines connecting along its length. In this case, snap the endpoint or startpoint of line B to where line A snaps to line B. Line A then continues on its own way again. You can see an example of this in the picture below.
In the picture above, the green line represents the A line, while the blue lines are named as the B lines. I have referred to their fragmented lines as segments. The yellow dots on the graph represent the first and last vertex points of the lines. As you can see, to identify the error in this picture, there are no vertex points to be snapped in the area surrounded by the red box. To rectify such situations, you should bring the non-snapped ends together or even break the non-segmented linestrings and connect them to the ends of the other linestrings. You can find articles below where I explain how to resolve such errors using database queries:
1 – Splitting Geometries from Vertex Points with PostGIS
2 – Splitting Lines With Intersecting Points Using PostGIS
If you believe your existing data is suitable for network analysis, you can experiment with your own data. Additionally, if you don’t have any data for testing, I have provided a file for you to download. It includes geojson, kml, shp, and sql files, all named “Patika.” However, in this post, I will focus on teaching the more critical aspects using the patika.sql file.
2 – Installing PostGIS and PgRouting Plugins When installing PostgreSQL, it will prompt you to choose which add-ons you want to install. However, you might have skipped this step and proceeded with the installation. In such a case, the procedures you need to follow are very easy and straightforward. Simply run pgAdmin and open the Query Tool in the database where you want to add the geometry, and then execute the following SQL codes.
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
CREATE EXTENSION pgrouting;
When you execute these codes, you should receive the result “Query returned successfully.” If PostGIS is already installed, it will inform you accordingly. Furthermore, if you encounter any issues with your plugins, you can refer to the PostGIS Installation and PgRouting Installation pages for detailed installation information.
3 – Adding Spatial Data to the Database
You can open the patika.sql file in the zip file I gave you with a text editor and browse the SQL codes in it. Using the GISLayer WebGIS Editor that I prepared while creating these files, I exported the geometries as sql files suitable for postgresql and shared them to you. If you wish, you can also use this site to create your own road data and download and add it to your database. If you want to learn how to do it, you can read it from this link or you can watch it on Youtube by clicking this link.
It represents the patika.sql file that I have shared to you with the code below.
- line, i create the sequence for the primary key
- created the patika db table between 2-4
- I added geometric column at 5. line
- In the 6th line, we create a spatial index. It is a feature that will speed up especially in large tables.
- All the lines below are SQL insert codes that help you insert the geometries in WKT format
CREATE SEQUENCE IF NOT EXISTS patika_SEQ START 1;
CREATE TABLE IF NOT EXISTS patika (
id BIGINT NOT NULL DEFAULT nextval('patika_SEQ'),
PRIMARY KEY(id));
SELECT AddGeometryColumn('patika', 'geom', 4326, 'LINESTRING', 2);
CREATE INDEX patika_SX ON patika USING gist (geom);
INSERT INTO patika(geom) VALUES
(ST_GeomFromText('LINESTRING(...WKT...)',4326)),
(ST_GeomFromText('LINESTRING(...WKT...)',4326)),
(ST_GeomFromText('LINESTRING(...WKT...)',4326)),
(ST_GeomFromText('LINESTRING(...WKT...)',4326));
4 – Defining Segments Node Numbers
At this stage, we need to create two new columns to assign node numbers based on the start and end points of the lines with a topological relationship. The following SQL codes will help you add the source and target columns to the path table and create an index. Please run these codes in the Query Tool to obtain the CREATE INDEX result.
ALTER TABLE patika ADD COLUMN source integer;
ALTER TABLE patika ADD COLUMN target integer;
CREATE INDEX patika_source_idx ON patika (source);
CREATE INDEX patika_target_idx ON patika (target);
5- Starting Topological Relation Analyzes
We need to establish the topological relationship between the lines using the pgr_createTopology function in PgRouting. The parameters of this function, in order, are the table name, tolerance (I recommend 0.0001), the name of the geometric column, and the primary key name of the table. Once all these are combined, a simple query like the following will be sufficient.
SELECT pgr_createTopology('patika', 0.0001, 'geom', 'id');
After running the code, you should see OK output. Also, a new table named patika_vertices_pgr will be created in your database based on your table name. If you don’t get this result, there is a issue with your topological relationship. You need to fix the topological problems and try again.
6 – Finding the Shortest Path Between Start and Finish Points
You are now ready to make queries. Using Dijkstra’s shortest path algorithm, we will write the SQL code that enables you to find the shortest path between two points with the pgr_dijkstra function. The view of our patika spatial table in the field is as follows. We will attempt to find the shortest path between the two yellow dots using the SQL code we will write.
Coordinate of the point in the green box: POINT(27.00232 39.23222)
Coordinate of the point in the red box: POINT(27.01509 39.23592)
We use the following query to find which linestring the yellow dot in the green box is close to
SELECT source FROM patika
ORDER BY ST_Distance(
ST_StartPoint(geom),
ST_GeomFromText('POINT(27.00232 39.23222)',4326),
true
) ASC
LIMIT 1
You can also do this for the red Point and find the closest line, but this will not be necessary. You can find all segments in order by entering the coordinate information of your first and last point together with the SQL code I will give below.
SELECT seq, node, edge, cost as cost, agg_cost, geom
FROM pgr_dijkstra(
'SELECT id, source, target, st_length(geom, true) as cost FROM patika',
(SELECT source FROM patika
ORDER BY ST_Distance(
ST_StartPoint(geom),
ST_GeomFromText('POINT(27.00232 39.23222)',4326),
true
) ASC
LIMIT 1),
(SELECT source FROM patika
ORDER BY ST_Distance(
ST_StartPoint(geom),
ST_GeomFromText('POINT(27.01509 39.23592)',4326),
true
) ASC
LIMIT 1),FALSE
) as pt
JOIN patika rd ON pt.edge = rd.id;
After running the code above, you will receive an output similar to the one shown below. To mention the columns in this output, respectively: seq represents the row number, node indicates the node number of the road where the node is located, edge denotes the node number of the road to reach, cost represents the length of the current road, agg_cost indicates the total path length at each step, and geom represents the geometry of your segments.
If you wish, you can run this query and produce a GeoJSON output by putting the SQL above where I wrote XXX below to get a full GeoJSON output
SELECT jsonb_build_object(
'type','FeatureCollection',
'features', jsonb_agg(features.feature)
)
FROM (
SELECT json_build_object(
'type', 'Feature',
'segment_id', a.seq,
'geometry', ST_AsGeoJSON(a.geom)::json,
'properties', json_build_object(
'segment_id', a.seq,
'node', a.node,
'edge', a.edge,
'cost', a.cost,
'agg_cost', a.agg_cost
)
) as feature
FROM (XXX) as a
) as features;
You can copy the GeoJSON output from this query and add it to a file named shortestpath.geojson and use it. I am sharing the file of the result I got from the query below.
download shortestpath.geojson file
If you open the GISLayer Web Editor and extract the file from the zip and drop it on the map with drag and drop method, the green line below will be displayed on the map. You can change these colors as you wish.
Now, as a result, the operation seems to be successful, but we cannot say that it gave a 100% correct result because some paths could not be linked to topology. However, we can say that it worked correctly.
I hope this post was useful for you.