Introduction to open-source GIS tools for journalists

For years, it’s been only the most committed of journonerds who could invest both the time and money to put geography to use.

To say that’s changing is an understatement. GIS (geographic information systems) is quickly becoming an integral part of how journalism is created and delivered.

Location is one of the key components of mobile, allowing information to be filtered based on where the reader is. It’s the foundation for pioneering concepts like EveryBlock and the reason most news organizations are enamored of geocoding news items. Geography is even a feature of Longitude, one of the inaugural projects of the New York Times’ beta620.

Luckily, as the need for geographic literacy has increased, digital cartography has exploded. Interactive maps and location-based services have unleashed a torrent of spatial tools throughout the past few years, making everything from analysis to sophisticated Web applications accessible.

There is a bevy of tools available, but here’s an introduction to those that make up my open-source GIS suite.

Getting your feet wet: QGIS

Journalists mostly use GIS to map and layer different data sets. Typically, these data sets come as “shapefiles,” geographic databases that contain points, lines or polygons, as well as information about each feature. The format comes from a commercial background (it was developed by Esri), but its specification is published, so it works as a fairly open standard.

Even non-native GIS databases can be mapped through geocoding, whereby addresses are plotted on a map. Just a few years ago, the process was a headache that could last for days. Now, thanks to tools such as Google Fusion Tables and Refine, it’s a lot easier.

You can compare geocoding results, satellite imagery and shapefiles to one another to reveal all sorts of things: how lottery sales relate to poverty; how well a county’s tornado sirens cover the population; or that minority neighborhoods are more at risk of foreclosure.
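
To make the tornado siren question concrete, here is a minimal Python sketch of the underlying idea: measure the distance from each population point to each siren and add up the people who fall inside the coverage radius. The data, the field names and the one-mile radius are all made up for illustration; a real analysis would use census blocks and actual siren locations, and QGIS would do this kind of work for you.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two longitude/latitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def covered_share(blocks, sirens, radius_miles):
    """Fraction of the total population within radius_miles of any siren."""
    total = sum(b["population"] for b in blocks)
    covered = sum(
        b["population"]
        for b in blocks
        if any(
            haversine_miles(b["lon"], b["lat"], s["lon"], s["lat"]) <= radius_miles
            for s in sirens
        )
    )
    return covered / total if total else 0.0


# Made-up illustration data: one siren and two population points.
sirens = [{"lon": -95.9375, "lat": 41.2586}]
blocks = [
    {"lon": -95.9375, "lat": 41.2600, "population": 300},  # a short walk north
    {"lon": -95.8000, "lat": 41.2586, "population": 100},  # several miles east
]
share = covered_share(blocks, sirens, radius_miles=1.0)
print(f"{share:.0%} of residents are within a mile of a siren")
```

Treating each block as a single point is a simplification; it is enough to show the shape of the question, not to publish a number.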

ArcView, one of the most well-known proprietary systems, remains the industry standard for GIS. You probably even have an installation in your building — if not in the newsroom, then perhaps in circulation or advertising.

Some newsrooms worry that without ArcView, their spatial capabilities are limited. Fear not: QGIS has quietly come of age. Over the past few years it has become easier to use and far more powerful. It’s now a more than worthy replacement for the old ArcView standby.

There’s no dearth of resources for getting your feet wet with spatial analysis in QGIS. Here are a few links to get you started:

  • The QGIS wiki has walkthroughs and video tutorials for everything from opening shapefiles (vector files) to handling projections, a term I never thought I’d deal with outside of sixth-grade social studies.
  • Michelle Minkoff has done a number of posts on getting spatial records — like those managed with QGIS — online with tools from Google. Here’s one looking at intensity maps.
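
If projections still feel abstract, this short Python sketch may help. It converts a longitude/latitude pair into Web Mercator, the projection behind the tiles from Google and OpenStreetMap. The function name is my own, and QGIS handles this conversion for you; the point is only to show that a projection is a formula that turns positions on a globe into flat x/y coordinates.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def lonlat_to_web_mercator(lon, lat):
    """Project longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# The same Omaha-area point used elsewhere in this article.
x, y = lonlat_to_web_mercator(-95.9375, 41.2586)
print(x, y)
```

The numbers that come out are meters east and north of the point where the equator meets the prime meridian, which is why projected coordinates look nothing like the latitude and longitude you started with.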

Stepping it up a notch: PostGIS

QGIS is great for analysis. But if you’re interested in giving your readers high-level functionality online, you’re going to need something with a bit more horsepower.

PostgreSQL is a database program. Like its peers, it lets users store and retrieve data, as well as ask questions about averages and group counts. And like its peers, it can serve as the backend for powerful websites that take advantage of reams and reams of information.

PostGIS is the spinach to Postgres’ Popeye. PostGIS adds spatial support to the usual database toolkit. Suddenly, instead of writing queries to find all records about a guy named Albert, you can ask for all records within 40 feet of Albert’s house. Or records within 40 miles. You can even join to unrelated tables to find, say, grocery stores in the same neighborhood as Albert. All within the friendly confines of traditional SQL. To find parcels within a quarter mile of a given point, for instance, you could run something like this:

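<code>
-- Assumes geom is stored as longitude/latitude (SRID 4326); 402.3 meters is about a quarter mile.
SELECT * FROM parcels WHERE
ST_DWithin(geom::geography, ST_GeomFromText('POINT(-95.9375 41.2586)', 4326)::geography, 402.3);
</code>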
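
If you would rather run that kind of query from a script, here is a hedged Python sketch that builds it with placeholders instead of pasting values into the SQL. It assumes the parcels table and geom column from the example, coordinates stored as longitude/latitude (SRID 4326), and a driver such as psycopg2 that uses %s placeholders. The function name is made up, and the connection itself is not shown.

```python
METERS_PER_MILE = 1609.344


def parcels_within_query(lon, lat, miles):
    """Return (sql, params) for a parameterized radius search.

    Casting to geography makes ST_DWithin measure in meters rather
    than in degrees, so the distance is converted from miles first.
    """
    sql = (
        "SELECT * FROM parcels "
        "WHERE ST_DWithin("
        "geom::geography, "
        "ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography, "
        "%s)"
    )
    return sql, (lon, lat, miles * METERS_PER_MILE)


sql, params = parcels_within_query(-95.9375, 41.2586, 0.25)
# With a live database cursor, this would run as: cursor.execute(sql, params)
print(sql)
print(params)
```

Keeping the values separate from the SQL text means a reader-supplied address or coordinate can’t alter the query itself.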

This is all well and good, but some people don’t like typing out commands. Besides, isn’t the best part of GIS actually seeing the maps you generate? QGIS can connect to a PostGIS database for viewing and editing, for absolutely free. It’s also going to give you an organizational boost: whereas a shapefile is really a bundle of several files (.shp, .shx and .dbf at a minimum, and usually a .prj as well) that all have to travel together, PostGIS keeps each layer in a single database table.
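
Until your data is in PostGIS, it’s worth checking that a shapefile arrived with all of its pieces. This is a small Python sketch, not part of any GIS library; the function name and the example file name are hypothetical. It looks for the three mandatory companion files plus the optional projection file next to a given .shp path.

```python
from pathlib import Path

REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")  # mandatory parts of a shapefile
OPTIONAL_EXTENSIONS = (".prj",)  # projection info; optional but usually wanted


def missing_shapefile_parts(shp_path):
    """List the companion-file extensions that are absent next to shp_path."""
    shp = Path(shp_path)
    wanted = REQUIRED_EXTENSIONS + OPTIONAL_EXTENSIONS
    return [ext for ext in wanted if not shp.with_suffix(ext).exists()]


# Hypothetical file name; an empty list means nothing is missing.
print(missing_shapefile_parts("parcels.shp"))
```

The check is case-sensitive on most systems, so a file named PARCELS.DBF would be reported as missing.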

Here are some additional resources to help you understand PostGIS:

Putting it online with OpenLayers

So, you’ve gotten your shapefiles from city hall, layered them together with some census information and pushed it all to PostGIS. You’ve got a great story to tell. Now you want to show your readers all your work, and let them mix and match it in new ways.

Enter OpenLayers, a JavaScript library that makes it relatively easy to create sophisticated maps. Because it’s JavaScript, “installation” is little more than pointing to some files from within a Web page. It also means it’s completely client-side, so your servers don’t have to do any of the heavy lifting.

A lot of pretty neat functionality comes baked in. OpenLayers allows you to use tiles from all sorts of different sources, including Google, VirtualEarth (Bing Maps) and OpenStreetMap. It even allows you to create your own base layers. One especially useful tool is the LayerSwitcher, which allows you to switch base layers and features on and off. Here’s an example of how you could set up a simple map with a LayerSwitcher to use VirtualEarth and Google tiles:

<code>
<script type="text/javascript">
// Assumes the page already loads the OpenLayers, Google Maps and Virtual Earth
// scripts, contains an element with id "map", and calls init() on load.
var map;
function init() {
    map = new OpenLayers.Map('map');
    map.addControl(new OpenLayers.Control.LayerSwitcher());
    var gmap = new OpenLayers.Layer.Google(
        "Default google", // the default
        {numZoomLevels: 20}
    );
    var bing = new OpenLayers.Layer.VirtualEarth(
        "Bing",
        {type: VEMapStyle.Shaded}
    );
    map.addLayers([gmap, bing]);
    // Center on longitude -95.9375, latitude 41.2586 at zoom level 9
    map.setCenter(new OpenLayers.LonLat(-95.9375, 41.2586), 9);
}
</script>
</code>

The best way to learn OpenLayers — as with most things — is to dive in and try it. The examples on the main page offer a great guide to all sorts of bells and whistles, and are pretty easy to pick apart.

Bringing it all together with Django

Django allows you to tie together all the powerful tools we’ve talked about. While a number of tools could get you to the same place, Django makes it easy.

Here’s a snippet to select parcels in a database near a certain location:

<code>
from django.contrib.gis.measure import D  # distance helper from GeoDjango
nearby = Parcel.objects.filter(geom__distance_lte=(parcel.geom, D(mi=.25)))
</code>

That single line of code tells my app to select all the parcels from PostGIS within a quarter mile of a certain property. Django then passes the results to OpenLayers, where we draw out all the parcels based on their coordinates within the database. The result is an interactive map with tons of depth and context.
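
One common way to hand results from the server to a map in the browser is GeoJSON, a text format that OpenLayers can parse. The sketch below is an assumption about how that handoff could look, not a description of my app: the parcel rows are made up, each parcel is reduced to a single point for brevity, and the function name is hypothetical.

```python
import json


def parcels_to_geojson(parcels):
    """Build a GeoJSON FeatureCollection string from simple parcel dicts."""
    features = [
        {
            "type": "Feature",
            # GeoJSON coordinates are ordered longitude, then latitude.
            "geometry": {"type": "Point", "coordinates": [p["lon"], p["lat"]]},
            "properties": {"address": p["address"]},
        }
        for p in parcels
    ]
    return json.dumps({"type": "FeatureCollection", "features": features})


# Made-up rows standing in for the results of the query above.
nearby_rows = [
    {"lon": -95.9375, "lat": 41.2586, "address": "100 Example St."},
    {"lon": -95.9380, "lat": 41.2590, "address": "104 Example St."},
]
payload = parcels_to_geojson(nearby_rows)
print(payload)
```

In a real app, a view would return this string in response to a request from the page, and real parcels would usually be polygons rather than points.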

This is only meant to be a brief introduction to some tools for working with maps. Digital cartography is an arena that continues to grow; there’s always something new to try and new boundaries to push.

What GIS tools do you find most effective?


This story is part of a new Poynter Hacks/Hackers series. Each week, we’ll feature a How To focused on what journalists can learn from emerging trends in technology and new tech tools.
