You Might Not Need QGIS

QGIS is a great tool when you have to work with geospatial data, but when you only want to do one or two small tasks, like renaming or filtering some attributes, you might not need a heavy GUI-based program. In this post I will introduce you to some GDAL (Geospatial Data Abstraction Library) command line tools that should help you get your work done more efficiently.

ogrinfo

Ogrinfo displays basic information about geospatial datasets.

Getting meta information

To get a first overview of a dataset, I use the following command. It displays the layer name, geometry type and all field names and their types. The parameter “-al” is for “all layers” and “-so” for “summary only”. Without “-so” it would also print every single feature, including its geometry.

ogrinfo example.shp -al -so

Getting information about the data itself

A great thing about ogrinfo is that you can use it in combination with OGR SQL, which lets you run SQL queries on your data. After “FROM” you have to type the name of the layer you want to query. For a shapefile it’s just the file name without the extension, but if you are not sure you can use the ogrinfo example.shp -al -so command to see the layer names.
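
If you script this, you can derive the layer name from the file name instead of typing it. This is a minimal sketch that assumes a shapefile; the path data/example.shp is a placeholder, and the count query is just an example.

```shell
# Placeholder input file; replace it with your own dataset.
input="data/example.shp"

# For a shapefile the layer name is the file name
# without the directory and without the .shp extension.
layer="$(basename "$input" .shp)"

# Build a query around the derived layer name.
sql="SELECT COUNT(*) FROM $layer"

# Run the query only when the placeholder file is actually present.
if [ -f "$input" ]; then
  ogrinfo -ro -sql "$sql" "$input"
fi
```

Layer names that contain spaces or other special characters would need extra quoting inside the query, which this sketch does not handle.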

Querying the min, max and average values of “field_a”
ogrinfo -ro -sql "SELECT MIN(field_a), MAX(field_a), AVG(field_a) FROM example" example.shp

Sorting by “field_a”
ogrinfo -ro -sql "SELECT field_a, field_b FROM example ORDER BY field_a" example.shp

These are only some basic examples. You can also join other layers and create more complex queries to explore your data. The OGR SQL documentation lists everything that is supported.
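
As a rough idea of what a join can look like: OGR SQL supports LEFT JOIN, and the second table can live in another file if you prefix its name with the quoted path. The file lookup.dbf, its layer name lookup and the field label are assumptions made up for this sketch.

```shell
# Hypothetical second table lookup.dbf with the fields field_a and label.
sql="SELECT example.field_a, lookup.label FROM example LEFT JOIN 'lookup.dbf'.lookup ON example.field_a = lookup.field_a"

# Run the query only when both placeholder files are present.
if [ -f example.shp ] && [ -f lookup.dbf ]; then
  ogrinfo -ro -sql "$sql" example.shp
fi
```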

ogr2ogr

The next tool I want to show you is called ogr2ogr. It converts vector data between many file formats and can process attributes and geometries along the way. Here are some example tasks I often run when working with geo data.

Converting a shapefile to GeoJSON

ogr2ogr expects the output file first and the input file second. The “-f” option sets the output format by its driver name.

ogr2ogr -f 'GeoJSON' output.geojson input.shp
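
Because this is a plain command, it is easy to repeat for many files. The loop below is a minimal sketch, assuming your shapefiles sit in the current directory; the helper name geojson_name is made up for this example.

```shell
# Hypothetical helper: derive the output name from a shapefile name,
# e.g. roads.shp becomes roads.geojson.
geojson_name() {
  printf '%s\n' "${1%.shp}.geojson"
}

# Convert every shapefile in the current directory.
for shp in *.shp; do
  # If nothing matches, the pattern stays literal, so skip it.
  [ -e "$shp" ] || continue
  ogr2ogr -f "GeoJSON" "$(geojson_name "$shp")" "$shp"
done
```
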

Selecting only some attributes

With ogr2ogr you can also use OGR SQL. The following tasks are good for cleaning up your data and structuring it the way you need it. Here we select only “field_a” and “field_b” and write them to a new file called “output.shp”.

ogr2ogr -f "ESRI Shapefile" -sql "SELECT field_a, field_b FROM input" output.shp input.shp

Casting an attribute type

Sometimes when you work with a shapefile the type of a particular field is not correct. If all values of an attribute are numerical, the data is easier to work with when the type is an integer or a float. Here we cast “field_b” to an integer and write the result to a new file, output.shp.

ogr2ogr -f "ESRI Shapefile" -sql "SELECT field_a, CAST(field_b AS integer(3)) FROM input" output.shp input.shp

Renaming attributes

We all love clean data sets, so we should also use proper names for our attributes. This task renames “FIELDA” to “field_a” and “FIELDB” to “field_b” with the help of the “AS” keyword.

ogr2ogr -f "ESRI Shapefile" -sql "SELECT FIELDA AS field_a, FIELDB AS field_b FROM input" output.shp input.shp

Filtering by attributes

In this example we create a new shapefile that only includes the features where the attribute “name” starts with an “A”.

ogr2ogr -f "ESRI Shapefile" -sql "SELECT * FROM input WHERE name LIKE 'A%'" output.shp input.shp

This command selects all features where the attribute “id” is between 25 and 99:

ogr2ogr -f "ESRI Shapefile" -sql "SELECT * FROM input WHERE id BETWEEN 25 AND 99" output.shp input.shp
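
If you only want to filter and keep all fields, ogr2ogr also has a “-where” option that takes just the condition, so you can skip the full SELECT statement. The sketch below combines both filters from this section and uses the same placeholder file names.

```shell
# Attribute filter: names starting with "A" and ids from 25 to 99.
filter="name LIKE 'A%' AND id BETWEEN 25 AND 99"

# Run the export only when the placeholder input file is present.
if [ -f input.shp ]; then
  ogr2ogr -f "ESRI Shapefile" -where "$filter" output.shp input.shp
fi
```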

Other CLI Tools

When I looked for CLI tools to process geo data I also stumbled upon topojson and mapshaper, which are both written in JavaScript (there is also an ogr2ogr Node wrapper). The problem I had with these tools was that they couldn’t handle larger files, because Node always ran out of memory. So I would only recommend these tools if you want to process smaller files.
