How to use APIs from Twitter, Google & Facebook to find data, ideas

As more and more journalists are finding, APIs are a great way to get data for your Web applications and projects.

An API, or application programming interface, enables software programs to communicate with one another. (Chrys Wu wrote a helpful intro here.) To give you a better understanding of how they can help you, I’ve outlined some of the best APIs for finding content and explained how you can use open-source programming tools to glean information from them.

Twitter API

Twitter’s API is very well documented and has a lot of useful functionality. It’s especially useful for journalists who want to search Twitter for a term and either show or parse the results. Let’s take a look at how we can easily do that.

Here is some simple example code that searches Twitter for the term “earthquake” and then creates a bulleted list for the tweets that are found. You can copy and paste the code and replace the word “earthquake” with whatever term you want to search for.
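As a rough illustration of that idea, here is a minimal Python sketch. It is not Twitter's official client. The endpoint address, the shape of the response (a "results" list whose items have "from_user" and "text" fields) and the sample tweets are all assumptions made for the example, and a live search would need whatever authentication Twitter's current documentation requires.

```python
import html
import json
from urllib.parse import urlencode

SEARCH_TERM = "earthquake"  # replace with whatever term you want to search for

# Hypothetical endpoint, used only to show how the query string is built.
SEARCH_URL = "https://api.example.com/search.json"


def build_search_url(term):
    """Return the search URL with the term safely encoded."""
    return SEARCH_URL + "?" + urlencode({"q": term})


def tweets_to_bulleted_list(response_text):
    """Turn a JSON search response into an HTML bulleted list."""
    data = json.loads(response_text)
    items = []
    for tweet in data.get("results", []):
        # Escape user-supplied text so it cannot break the page markup.
        user = html.escape(tweet["from_user"])
        text = html.escape(tweet["text"])
        items.append("  <li>@{}: {}</li>".format(user, text))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Made-up sample response standing in for what a search might return.
SAMPLE_RESPONSE = json.dumps({
    "results": [
        {"from_user": "example_user", "text": "Felt an earthquake just now"},
        {"from_user": "another_user", "text": "Earthquake drill at 10 <today>"},
    ]
})

if __name__ == "__main__":
    print(build_search_url(SEARCH_TERM))
    print(tweets_to_bulleted_list(SAMPLE_RESPONSE))
```

Keeping the URL building separate from the list rendering means you can swap in a real response later without touching the code that produces the bullets.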

If your development team uses open-source tools, there are some great libraries for parsing the Twitter API with much more advanced functionality than the snippet above. For PHP, use TwitterOAuth. For Python, use Tweepy. For Ruby, use Grackle.

Google Maps API

Google Maps has an extensive API with many different features. You can use it to build maps, geolocate tweets (or other pieces of data with latitude and longitude variables), search for local schools, or even measure elevation or distance. The API has a lot of documentation with several examples. Depending on what you’re trying to do, you can find code snippets to help you achieve your goal.

Here are a few good examples from the documentation:

There are also numerous wrappers for the Google Maps API in PHP, Python and Ruby, so consider reaching out to your development team for advice on how to integrate Google Maps with the apps they’re already building.

Facebook API

The Facebook Graph API is JSON-enabled and has search functionality similar to Twitter’s. You can search for who is posting publicly about topics, search Facebook Places, and see photos and videos posted publicly or by your page’s followers.

Facebook has a fairly extensive section of code examples that you can use and tweak to your liking. Some highlights are:

Other noteworthy APIs

The folks at Participatory Politics and Sunlight Labs have built a series of amazing APIs with access to government data — a notoriously tricky task. They have a new project called Open Government that aims to help users track numerous government data sets pertaining to politicians, bills, campaign donations and voting — all the way down to the local level.

Although they don’t have support for every state yet, Open Government is under active development, so more states will be added over time. The code they’re using is all open-source and built in Ruby.

Developers at the Chicago Tribune and The New York Times regularly blog and share their own APIs. Check out what projects they’re working on, and build and release your own.

Experimenting with APIs is just one of many ways to build your skills as a digital journalist. The more you know about the open-source culture, the more you can effectively share data, collaborate with others in the newsroom and, perhaps most importantly, tell innovative stories.

This story is part of a new Poynter Hacks/Hackers series. Each week, we’ll feature a How To focused on what journalists can learn from new tech tools and emerging trends in technology.

Beginner’s guide for journalists who want to understand API documentation

There are three letters that have been floating around the media world for several years now: API. Short for “application programming interface,” an API enables software programs to communicate with one another, allowing your programs to share data and interact in a variety of ways.

There have been lots of articles about why it’s important for news outlets to have and use APIs.

To help you get the most out of an API, a conscientious creator will often produce a guide, called documentation or docs. There is no single standard for API documentation. The quality varies widely: some docs are indexed and orderly, some are pretty, some plain, some messy, and some incomplete or nonexistent.

There aren’t many resources that explain API documentation to non-coders. And because the format isn’t standardized, it’s hard to write a one-size-fits-all guide to reading the manual. But assuming you’re dealing with a well-documented API, here’s an overview of how to figure it out.

The fundamental question: What can this API do for me?

Look for mentions of the word “requests.” If you don’t see that, look for the words “REST API,” or something that looks like the latter part of a URL.

Within those sections, look for the words “get” and “post.” These are called methods, the specific actions the API can do. (Some developers will quibble and call them functions. For this tutorial, we’ll stick to methods.)

If the documentation is written in plain English, it will be easy to understand what the method is doing. If not, you’ll need someone with more coding experience to help interpret what’s going on. But know this:

“Get” asks for something from the API server — as in, GET me the number of times an address shows up in the database.

“Post” changes the database by creating, adding or removing something from it — as in POST a new address to the database.
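To make the difference concrete, here is a minimal Python sketch that builds one request of each kind using the standard library. The API address and the address fields are hypothetical, and the requests are only constructed, not sent.

```python
import json
from urllib.request import Request

# Hypothetical API address, for illustration only.
BASE_URL = "https://api.example.com/addresses"

# GET: ask the server for something; nothing in the database changes.
get_request = Request(BASE_URL + "?street=Main+St", method="GET")

# POST: send new data for the server to store.
new_address = json.dumps({"street": "123 Main St"}).encode("utf-8")
post_request = Request(
    BASE_URL,
    data=new_address,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# The requests are only built here, not sent; passing one to
# urllib.request.urlopen() is what would actually contact a server.
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url)
```

Notice that the GET request carries its question in the URL, while the POST request carries its new data in the body.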

In what format can I get the data?

An API usually lets you choose how the data will come back to you, also known as the response format. You’ll usually see “json” or “XML.” Sometimes, you’ll see “txt” or other formats. The format is best decided by your developer, but at least you’ll know what’s available.

To find format options, search for the word “format” or “response.” Sometimes the format is mentioned at the start of documentation; sometimes, you’ll find “format” in the methods.
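If you are curious what the choice means in practice, the sketch below reads the same record from a JSON response and from an XML response using Python's standard library. The record itself (a review with a title and a year) is invented for the example and does not come from any particular API.

```python
import json
import xml.etree.ElementTree as ET

# The same made-up record in two response formats.
JSON_RESPONSE = '{"review": {"title": "Example Film", "year": 2011}}'
XML_RESPONSE = "<review><title>Example Film</title><year>2011</year></review>"

# JSON maps directly onto dictionaries and lists, and keeps numbers as numbers.
json_review = json.loads(JSON_RESPONSE)["review"]

# XML is read as a tree of elements whose text is always a string,
# so the year has to be converted to a number by hand.
xml_root = ET.fromstring(XML_RESPONSE)
xml_review = {
    "title": xml_root.findtext("title"),
    "year": int(xml_root.findtext("year")),
}

# Both formats end up carrying the same information.
print(json_review == xml_review)
```

Either format can carry the same data; the difference is mostly in how much work the developer's code does to unpack it.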

What does the API need in exchange for what I want?

Sometimes you can make an API request or post without identifying yourself. But API creators often want to know how the API is being used and by whom. In addition, they want to prevent server overload and head off developer hijinks, so many APIs require a key: an ID unique to the person or program making a request.

Getting a key is generally straightforward. Look for the word “authentication,” “API key” or “APIkey” to get the instructions, and to see which methods (which “gets” and “posts”) require authentication.

Can I test API requests even if I’m not a developer?

Yes. You can build your own test request by copying the example request found in the method and changing the variables, usually referred to as parameters.

For example, let’s try getting New York Times reviews for the “Harry Potter” movies as an XML-formatted response. Use your favorite search engine to find The New York Times movie reviews API. This API is not perfect (it’s in beta, after all). The steps below can be compressed with shortcuts once you become more experienced, but since we’re assuming this is your first time, we’re going to take the slow road.

Once you’re on the API page:

1) Look for something that allows you to get reviews using keywords. In this case, that’s the “Reviews by Keyword” method. Within the method description is a URI example (the text in the gray box). That’s the template for your request.

Copy it, paste it into a text editor [TextWrangler (Mac), TextMate (Mac) or TextPad (Windows)] and start replacing the parameters, the things in braces and brackets. They’re bolded below for easy reference.

In the Reviews by Keyword method, there are two required parameters: version, which is the API version (use v2), and API-key, which you can get right here.

You’d go from this:

To this:
http://api.nytimes.com/svc/movies/v2/reviews/search[.response_format]?[optional-param1=value1]&[...]&api-key={paste your API key here and delete the surrounding braces}

2) Next, set up three additional parameters, which are described a little further down in the same section of the Movie Reviews API documentation:

  • The response-format, which will be .xml
  • A keyword query — we’ll use query=Potter because searching for ‘Harry+Potter’ doesn’t work. (I know because I tried. Remember, the API is in beta.)
  • An opening-date range, from the first film (which came out in November 2001) through the last film (which comes out this week). As the documentation tells you, the format for a range is YYYY-MM-DD;YYYY-MM-DD, so we’ll use opening-date=2001-11-01;2011-07-31

Your URI example should now look like this (the new parameters are in bold):
http://api.nytimes.com/svc/movies/v2/reviews/search.xml?&query=Potter&opening-date=2001-11-01;2011-07-31&api-key={paste your API key and delete the surrounding braces}
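If you or your developer would rather not assemble the URI by hand, the same steps can be sketched in a few lines of Python. This is only an illustration of the hand-built request above, not code from The New York Times: the function name is made up, and "YOUR_API_KEY" is a placeholder for the key you were issued.

```python
from urllib.parse import urlencode

# Pieces of the request, taken from the steps above.
BASE = "http://api.nytimes.com/svc/movies/v2/reviews/search"
RESPONSE_FORMAT = ".xml"


def build_review_uri(api_key, query, opening_date):
    """Assemble the Reviews by Keyword URI from its parameters."""
    params = {
        "query": query,
        "opening-date": opening_date,
        "api-key": api_key,
    }
    # safe=";" keeps the semicolon in the date range from being
    # percent-encoded, matching the range format described above.
    return BASE + RESPONSE_FORMAT + "?" + urlencode(params, safe=";")


# "YOUR_API_KEY" is a placeholder; substitute your own key.
uri = build_review_uri("YOUR_API_KEY", "Potter", "2001-11-01;2011-07-31")
print(uri)
```

Letting a function join the parameters avoids the most common hand-editing mistakes, such as a missing ampersand or a stray brace.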

3) Copy and paste the URI you made into a Web browser address bar. Hit return.

If you made the changes correctly, you’ll get a response similar to what’s on this page. In fact, if you want, you can copy the URI above up to the = before the {, paste it into your browser’s address bar, and add your API key to the end and hit return to see the XML output.

Voilà. You’ve just made your first API call and pulled New York Times “Harry Potter” movie reviews. (Plus a straggler. Again, beta.)

Some API developers are nice enough to include a console, sandbox or fill-in-the-blank form so you can test your requests without hand-building them. Better yet, the tools usually generate both the properly formatted request and the result, which you and your developers can then copy and paste and use as you wish.

You will come across lots of documentation styles as you begin to explore what’s available to you. If you have questions about what you find, feel free to ask them on the Hacks/Hackers help board.

Chrys Wu is a journalist, strategist, coder and cook. When she’s not advising clients on user engagement and community building, she launches and organizes Hacks/Hackers meetups that bring journalists, developers and designers together to reboot news. She’s on Twitter @MacDiva.


Zite incident shows why publishers need to enable automatic, controlled content distribution

In an era of free, frictionless content distribution, how can creators of that content be paid for their work?

The question was highlighted on Wednesday as 11 major media organizations — from Dow Jones Co. to Time — sent a letter to news aggregator Zite ordering the company to stop what the news outlets characterized as pervasive copyright infringement.

Zite pulls Web content from a wide variety of sites, reformats it, and displays it — without the ads — within its app. No one can argue about the infringement; Zite has already changed the way it presents the complainants’ content.

But presentation is not the reason consumers downloaded the iPad app 120,000 times in the first week. The real value of the app is its ability to predict which stories will appeal to each user.

For publishers, the problem is that Zite is really, really good at personalization and filtering. In my use of the app over the past few weeks, I’ve consistently found that the app shows me headlines I want to click on – and that’s the test that really matters.

We in media should think about what led us to this place, where major news outlets are targeting a company that is creating something they should create: an innovative, personalized news source.

What efforts have major media companies made to build or enable their own innovative news consumption products?

No product developed by a major (or minor) media company is as effective as Zite. Trove, from The Washington Post, is a work in progress when it comes to recommendations. Ongo, funded by a collection of media companies, is a product in search of an audience that wants to pay for a limited collection of news sources.

And News.me, which is being developed by Betaworks with partial support from The New York Times, has yet to make it to the iTunes Store. (The Times’ Martin Nisenholtz criticized Zite in a speech on Monday, saying it scrapes and caches content in violation of copyright law, but the Times didn’t sign onto the cease and desist letter.)

Even Flipboard and Pulse — which are not products of major media companies anyway — are light on personalization so far.

The challenge media companies face is that they have so many fragmented distribution channels. Some, like full RSS feeds, contain entire articles. Others, like Facebook posts or e-mail newsletters, have just enough information in their headlines and summaries to satisfy some consumers. Add mobile sites and apps, Twitter feeds, YouTube channels, even Flickr galleries and Tumblrs, and you see all the different ways that publishers are setting their content loose on the Internet.

The good news is, this is exactly what consumers desire – news and information when and where it’s convenient. The bad news is, with such broad distribution it is tough to monetize content and even tougher to control its reuse.

Content creators must find a way to protect their property and spread it widely across multiple channels.

The New York Times’ Martin Nisenholtz focused on this topic on Monday in a speech to the Newspaper Association of America. Rather than content creators, “platforms win in Web 2.0,” Nisenholtz said — companies like Google and Facebook. Content companies need to create a “web of managed links.”

The Times’ new metered access plan is part of its reassertion of control over how its content is distributed.

But the key to success here is not in restricting access to content in order to increase its value; it’s exploiting the value inherent in wide distribution. The challenge is huge, but it is largely technical.

Media companies have three possible winning strategies:

  • Develop their own innovative apps
  • Collaborate with developers like Flipboard and Zite to display and monetize content
  • Implement robust application programming interfaces (APIs) that allow for controlled distribution of content for use on external sites and apps

In truth, none of those are perfect solutions. For the most part, what many consumers want (free, easy access to content) conflicts with the legacy business model of most news organizations.

So what about developing their own apps? The best aggregation effort so far is Ongo, which offers a limited number of news sources for a monthly subscription. It strikes me as the kind of product-by-committee that emerges when established companies try to disrupt the disrupters. To create a truly compelling tablet app, a publisher will have to disrupt itself and probably have to annoy other news organizations in the process.

Working with external developers is a possibility. At least a dozen publishers are beta-testing the Flipboard Pages product, which shows some promise for repackaging digital content and monetizing it with full-page, interstitial ads.

Zite is more or less pursuing the same business model.

But if these companies intend to work closely with publishers, each of those partnerships requires lawyers, negotiations and contracts. Some publishers will demand different terms. All that negotiating gets in the way of innovation — in the way of building the product.

There’s an easier way to accomplish the same thing, although it’s also the furthest from reality right now: an open system that enables distribution and reuse as well as control and revenue sharing.

What publishers and developers need is a standard API that enables distribution of content for authorized purposes, monitors its use, offers standard advertising units and subscription requirements, and provides a way to share revenues.

The key here is that approval for “authorized use” would be automatic, contingent on standard terms of service. Mobile and Web developers would be able to pull stories, photos and video into their websites and apps as long as the advertising (or other monetization tools) is presented in context.

The publisher would have the ability to limit the use of the API, from who can access it to how many items could be republished per hour. But the end result would be an app that looks like Flipboard or Zite, supported by a sustainable business model for publishers and developers.
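To give a sense of how modest the technical piece of that control could be, here is a minimal Python sketch of a per-key hourly limit. It is an assumption-laden illustration, not a description of any existing publisher API: the class name, the key names and the limits are all hypothetical.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # one hour


class HourlyQuota:
    """Track how many items each API key has pulled in the past hour."""

    def __init__(self, authorized_limits):
        # Maps each authorized key to its items-per-hour limit.
        self.limits = dict(authorized_limits)
        self.history = defaultdict(deque)

    def allow(self, api_key, now=None):
        """Return True and record the request if the key may republish an item."""
        if api_key not in self.limits:
            return False  # unknown keys are not authorized at all
        now = time.time() if now is None else now
        recent = self.history[api_key]
        # Drop timestamps that have aged out of the one-hour window.
        while recent and now - recent[0] >= WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= self.limits[api_key]:
            return False
        recent.append(now)
        return True


# A hypothetical partner app allowed two items per hour.
quota = HourlyQuota({"partner-app": 2})
print([quota.allow("partner-app", now=t) for t in (0, 10, 20, 3601)])
print(quota.allow("unknown-app", now=0))
```

The same lookup that enforces the limit also answers the "who can access it" question, since a key that is not on the publisher's list is simply refused.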

To some extent this concept includes technology already in use: advertising networks, e-commerce systems, tracking and analytic tools and APIs.

Several major organizations have already created content APIs, including NPR, The Guardian and The New York Times. They provide access to everything from recipes to radio transcripts to congressional data.

Extending those tools to provide a foundation for content and advertising on mobile and tablet apps might be the best way to balance the two interests at tension within Zite: news companies’ need for revenue and control, and the public’s desire for news and information everywhere, all the time.

