Introduction to PostGIS: Part 1
Overview & Installation
PostGIS is a promising extension for the PostgreSQL database that adds spatial objects to a database. In simple words, PostGIS adds geographical capabilities to a database. With the growing importance of Location Based Systems (LBS) in our day-to-day life, PostGIS is a blessing in disguise to developers and architects building geo-aware applications.
PostGIS removes the trial-and-error based method of location based content storage, retrieval and search, by providing a concise and accurate back-end to perform such calculations.
Developers with a MySQL background would know how difficult it is to build and scale a location based database in MySQL. Although MySQL has some spatial extensions, they are far from offering the advantages and speed that PostGIS provides.
With the recent release of PostGIS 2.1 – the power of PostGIS has tripled, from support for Raster and Vector analysis in the database to support for 3D geometries (latitude, longitude and height/altitude).
In Part 1 of this tutorial, we will cover:
- Installing the prerequisite libraries for PostGIS.
- Setting up PostGIS on Linux.
- Creating your first spatial database – TestGIS.
- Importing data into TestGIS
In Part 2, we will go on to cover
- Connecting to TestGIS with PHP
- Connecting to TestGIS with Java
- Connecting to TestGIS with .Net
- Visualizing data from TestGIS on a Map
Installing the prerequisite libraries for PostGIS
PostGIS is a complex extension – if not configured properly, the database will not give the desired output. PostGIS relies on a couple of open source projects that are required to be installed before installing PostGIS.
The following are the required prerequisites for our PostGIS installation:
- PROJ4 4.8
- GEOS 3.3.3+
- GDAL 1.9
- PostgreSQL 9.1+
- LibXML2 or higher.
- JSON-C 0.9 or higher.
- GNU C Compiler and GNU Make – If installing from source.
Optional libraries that would be good to install are:
- GTK 2.0
- Apache Ant
We will use a VirtualBox appliance installed with the latest version of Ubuntu Linux and we will configure the same for our PostGIS 2.1 installation.
How to do it
Download and install VirtualBox from http://www.virtualbox.org/. Download the ISO of the latest version of Ubuntu Linux(14.04) and create a new VirtualBox appliance named Ubuntu and complete the installation of the Ubuntu operating system.
Within the Ubuntu OS, create a user named postgis with the password as postgis as well. Make sure the user has administrative permissions.
Next open up the Terminal and install the PROJ4 library by the following command
sudo apt-get install libproj-dev
Next install the GEOS library by
sudo apt-get install libgeos-3.4.2
The GDAL library is next
sudo apt-get install libgdal-dev
Now we install the PostgreSQL RDBMS 9.3 by
sudo apt-get install postgresql
sudo apt-get install postgresql-client
We also need the PostgreSQL server development packages, so install the same using
sudo apt-get install postgresql-server-dev-9.3
The JSON-C library is installed by
sudo apt-get install libjson0 libjson0-dev
It’s a good idea to install the above libraries from their respective sources, since the Ubuntu repositories may have an older version of the libraries. To compile the above libraries from source, make sure you have the GNU Compiler Collection (GCC) and GNU C++ compiler (G++) installed.
If you haven’t installed it yet, use the following command to do so.
sudo apt-get install gcc g++
To make sure we have the latest libraries, we download the sources for PROJ4 from http://trac.osgeo.org/proj/, GEOS at http://trac.osgeo.org/geos/ and GDAL from http://trac.osgeo.org/gdal/wiki/DownloadSource. To install the latest libraries, you will have to unzip the downloaded packages and run the following commands
sudo make install
How it Works
The OSGeo: Open Source Geospatial Foundation (osgeo.org) is a collaborative effort to bring all open source geospatial projects under one umbrella. It maintains the latest version and development versions of the leading geospatial libraries. It also provides legal, financial and infrastructural support to these projects. We use the latest versions of these libraries to make sure we have the latest build of the same, so that our PostGIS installation does not run into trouble.
It is a good idea to install pgAdmin3 – a GUI for managing PostgreSQL database; you can do so by
sudo apt-get install pgadmin3 on your terminal.
If you run into trouble installing the prerequisites as mentioned above, make sure you have the libxml2 (XML C Parser) library installed, if not already present on your Ubuntu distribution. Install the same if needed by
sudo apt-get install libxml2 libxml2-dev
Setting up PostGIS on Linux
PROJ, GEOS and GDAL are key libraries required for PostGIS to work. Beside these libraries, PostgreSQL 9.3 is the database that we will use to configure PostGIS. Starting with PostgreSQL 9.1 and PostGIS 2.0 there are two methods of adding the PostGIS library to PostgreSQL- by using Extensions or by using the PostGIS enabler scripts(.sql files). We will use the extensions method.
Download the latest source code of the PostGIS library from http://download.osgeo.org/postgis/source/postgis-2.1.2.tar.gz
How to do it
Open up your terminal and unzip the downloaded postgis source code using the
gunzip command, which will create a postgis-2.1.2.tar file.
Extract the content of the tar file by using the
tar –xf postgis-2.1.2.tar
Now change to the postgis-2.1.2 directory and run the following commands to install the postgis library on your system.
sudo make install
If everything works perfectly fine, you should see the following two lines at the end of the
make install command
Leaving directory /home/postgis/postgis-2.1.2/extensions/postgis/topology
Leaving directory /home/postgis/postgis-2.1.2/extensions
Update your PATH to include the PostGIS command line programs like
shp2pgsql, under Linux edit your .bashrc file and append the following line
If you are on a different operating system, make sure you download the right files, some good developers have released the un-official RPMS, DEB files and other repositories for PostGIS 2.1 installation, since it takes a while for the official distribution channel to update the versioning of packages, based on their own release schedule and testing.
The latest stable version of PostGIS in the Ubuntu 14.04 repository is postgis-2.1, this can be installed with a single
sudo apt-get install postgresql-9.3-postgis-2.1 command.
Creating your first spatial database – TestGIS
With PostGIS 2.1 configured on our Ubuntu distribution, let’s create a new PostGIS-enabled database called TestGIS. We will use this database throughout the tutorial to depict a real world usage scenario for the readers’ understanding.
pgAdmin3 is a good GUI tool to work with the PostgreSQL server. If you haven’t installed it yet, now is a good time to do so. However it is not a prerequisite, we can work our way through with just the terminal prompt.
Before we create a PostGIS database, we need to modify the default pg_hba.conf configuration file for our PostgreSQL server so that it accepts connections from our localhost and/or remote hosts (as needed).
Open your pg_hba.conf (on our Ubuntu config the file is located at /etc/postgresql/9.3/main/ and navigate to the Client Authentication lines as shown below and change the METHOD to trust for all local connections.
Restart the PostgreSQL server by issuing the following command on the terminal
sudo service postgresql restart
Next add a new PostgreSQL user named postgis with the password postgis by using the
createuser interactive command.
sudo createuser postgis –Upostgres –P
Enter password for new role:
Enter it again:
Shall the new role be a superuser? (y/n) y
You can confirm the newly created user by visiting the Webmin GUI package on your local browser. Webmin is a web based system administrator tool for Linux – no more searching for configuration files, Webmin provides all services within one rich GUI. Download it from http://www.webmin.com/. Here is our screen grab from the PostgreSQL module that shows our newly created user as an entry.
How to do it
The first step to creating our first PostGIS database is to create a regular PostgreSQL database using the
createdb command. On your terminal fire the following command to create a new database named TestGIS.
createdb TestGIS -Upostgis -W
Enter the password postgis when asked by the command prompt.
Now from the PostgreSQL console(psql) select the newly created created TestGIS database by typing the following command
psql -U postgis -W TestGIS
Next, enable the postgis extension for the TestGIS database with
TestGIS=# CREATE EXTENSION postgis;
We can use pgAdmin3 to verify our newly created TestGIS database, open the pgAdmin3 tool and create a new server with username and password as “postgis” and port set to 5432 (default PostgreSQL port). Name the connection as TestGIS. See the screenshot below for an example.
Once you click OK, expand the tree by clicking on the Databases->TestGIS node and verify that postgis is listed as an extension, as shown in the figure below.
How it works
PostgreSQL 9.1 and higher includes the
CREATE EXTENSION directive that allows us to load a new extension into the current database with a simple command. PostGIS 2.1 supports this mechanism, so when we used
CREATE EXTENSION postgis on our TestGIS database, the system imported the geometry, geography, raster, spatialrefsys and everything related into our database. Similarly the topology feature in PostGIS 2.1 is another extension that can be loaded by
CREATE EXTENSION postgis_topology.
Loading an extension is equal to running the extensions .sql file. Postgresonline.com has a great article on extensions at http://www.postgresonline.com/journal/archives/216-PostgreSQL-9.1-Exploring-Extensions.html visit the link to know more about extensions.
Importing data into TestGIS
Having created our first postgis-enabled database, it is time now to import some real-world data. When we say real-world we do mean real world data. We source this information from the Yahoo’s Flickr Geo API Explorer and specifically this link containing the geographical boundary for United States – http://www.flickr.com/places/info/23424977
Notice the link to the Shape file (enclosed in the rectangle in the following screenshot). Shapefile is an open source file format to store geospatial data. This format has been developed and regulated by ESRI – Environmental Systems Research Institute (http://www.esri.com/).
To store the country geographical boundary, we define a country table in the TestGIS database with the following SQL statements:
psql –Upostgis –W TestGIS
TestGIS=# CREATE TABLE country(id integer NOT NULL, name text, boundary geography(MultiPolygon,4326),name_alias text, CONSTRAINT "countryId" PRIMARY KEY (id ));
Note the datatype of the column named ‘boundary’ which is not your traditional database column of type int, varchar or datetime, but a geography datatype. This column can store a multipolygon data (as also seen in the screenshot above). The 4326 number is the SRID (Spatial Reference System Identifier) corresponding to the WGS-84 (World Geodetic System). The mapping for the same can be inspected in the
spatial_ref_sys table in our TestGIS database.
Note that the WGS-84 is longitude first and then latitude, so when working with data ensure you store and retrieve data in lon/lat pairs and not lat/lon pairs.
How to do it
Fire up your terminal again and run the
shp2pgsql command on your shape file to convert the shape file to SQL, and then selectively insert the converted data into the country table.
Navigate to the folder where we downloaded the Shape file, in our case it is
Now run the
shp2pgsql command by
postgis@postgis:~/Downloads/Geo-Data/country/US$ shp2pgsql -G 23424977.shp > US.sql
Open the US.sql file so generated, it should look like below.
From the file, copy the first insert statement, namely
INSERT INTO "23424977" ("tag","count","area","perimeter","density",geog)
We just need the values from the last column named ‘density’, which is basically just one Polygon from the US boundary shape file we downloaded before. Using this raw text we construct our insert statement as
INSERT INTO country(id,name,boundary,name\_alias) VALUES(1,'United
Save this file as insert.sql and import the same in your TestGIS database.
postgis@postgis:~/Downloads/Geo-Data/country/US\$ psql -Upostgis -W TestGIS < insert.sql
To verify the data open the pgAdmin3 GUI and run the following query in the SQL Editor
SELECT id, name, ST_asText(boundary), name_alias FROM country;
You should see the following output.
How it works
shp2pgsql command bundled with PostGIS 2.0 offers an easy way to import geospatial data from ESRI Shapefiles. There is a GUI for the command as well, named shp2pgsql-gui that allows import/export of shape file data. There are libraries and programs to read Shapefiles in different languages and platforms: visit http://en.wikipedia.org/wiki/Shapefile or your friendly neighborhood search engine for more information.
Another better way to get the country boundaries from the Flickr service is via the API, on the same United States page on the Flickr Geo API Explorer click on the
flickr.places.getInfo box’s View in API explorer, the raw API link is http://api.flickr.com/services/rest/?method=flickr.places.getInfo&api_key=907997443d2d8f1f06aa5073c4d9dfeb&woe_id=23424977, which can be parsed in any popular programming language and the data inserted into our country table.
In the previous example, we learnt to import Shapefile data into our PostGIS database. However it was a manual copy-paste job and it did not import all the polygons needed for the full USA boundaries. Moreover if we need to automate this process and want to import all the countries of the world into our database, then it becomes a herculean task to do in the copy-paste fashion.
To resolve this, we use the PostgreSQL functions for PHP, JDBC for Java and Npgsql for .Net.
In Part 2 of this tutorial, will will step through how to connect with TestGIS using PHP, Java and .NET, respectively.
Author: Zeeshan Chawdhary