An introduction to newsroom programming technologies

Newsrooms around the country use code to expand their reporting, create alternative storytelling formats and engage audiences in new ways. Opportunities to enhance newsgathering and publishing with programming skills are significant and growing. So too are the calls to teach journalism students coding alongside writing, editing and reporting.

Many journalism schools recognize the value of technology training in their courses, but they face roadblocks when adding programming to their instruction. One fundamental challenge concerns what tools to focus on and when to teach them.

Journalists looking to improve their technical skill sets face similar issues. There’s no shortage of ways to learn code, but it may not be clear where to begin or how technologies fit together to make code-infused journalism possible.

To help address these issues, here’s a look at the most popular programming languages (and related tools) used by some of the newsrooms at the forefront of journalism and coding.

I’ve grouped the list into several thematic areas. These may lend themselves to modules in, say, a data journalism class. Or they may point to courses unto themselves.

Frontend technologies
Any news app, interactive infographic or other tool delivered on the Web is bound to make use of two foundational technologies: HTML and CSS. Combined with actual content, these tools give us everything we need to publish basic stories and graphics across devices. And they’re necessary in complex applications that require more technology.

  • HTML (Hypertext Markup Language). When the Web was invented, HTML was a founding technology. It remains an essential tool for publishing on the Web. It’s a means to add structure and semantic meaning to content. This makes it possible for browsers and other software (from search engines to screen readers) to make sense of what we’ve published. Though HTML is not strictly a programming language, it’s hard to avoid when developing simple stories or sophisticated applications. And more sophisticated tools, from JavaScript to Ruby on Rails, integrate closely with HTML.
  • CSS (Cascading Style Sheets). Like HTML, CSS is a foundational technology with no substitutes. HTML gives content structure, and CSS designs it. Typography, color and layout are some of the presentation options CSS lets us define. Just as HTML has been revitalized with the emergence of a new, richer standard, CSS has benefited from ongoing development. It’s the primary tool designers use to implement responsive Web designs, making it a key ingredient when thinking about building content that works across devices.

JavaScript
JavaScript is another ubiquitous technology. Unlike HTML and CSS, it’s a full-fledged programming language with many sophisticated capabilities. On news sites, JavaScript usually adds a layer of interactivity to projects, making it possible for users to interact with complex interfaces. Many related technologies make JavaScript a valuable point of focus when learning to program for journalism.

  • jQuery. jQuery is one of the most popular JavaScript libraries. In programming, a library is a collection of prewritten code that solves common problems, accelerating the development process. In this case, jQuery makes JavaScript easier and faster to write and standardizes inconsistencies across browsers. jQuery focuses on making Web pages dynamic by manipulating the DOM, or document object model. This makes it possible to take part of a Web page and change it on the fly, for example, in response to a user action. This basic premise — changing part of a document after a user interacts — is the foundation of much of the interactivity seen on the Web today.
  • CoffeeScript. CoffeeScript is a different way to write JavaScript. It maps very closely to the JavaScript language, but it standardizes and simplifies some of the more tricky syntax. In the end, CoffeeScript compiles into JavaScript code, so it’s really about streamlining workflows. NPR uses this tool in some of its projects.
  • JSON. The JavaScript Object Notation is a standard for formatting and transmitting data from one site or application to another. JSON makes it possible to represent data in a way that’s both highly structured (good for computers) and easy to read (good for humans). With the proliferation of APIs, or application programming interfaces, that allow systems to exchange information with one another, JSON has become a vital part of many news applications.
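
To make the JSON description concrete, here is a minimal sketch in Python (one of the server-side languages covered below) using its standard json module. The record and its field names are invented for illustration; a real API would define its own.

```python
import json

# A hypothetical record, invented for illustration.
story = {
    "headline": "City council approves budget",
    "published": "2013-05-01",
    "tags": ["politics", "budget"],
    "word_count": 640,
}

# Serialize the record to a JSON string, as an API might before sending it.
payload = json.dumps(story, indent=2)
print(payload)

# Parse the string back into native data, as a receiving application would.
received = json.loads(payload)
print(received["tags"][0])
```

The printed payload is plain text that a person can scan, yet the receiving side recovers the same nested structure the sender started with.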

Data stores
Data stores are technologies used to archive information, usually in a highly structured way. That usually means many individual records, each with the same parts. Imagine a list that shows the names of organizations, for example, along with their locations, emails and phone numbers. Having a strong structure makes it possible to retrieve information in predictable ways. You could gather a list of all phone numbers for an organization, for instance, or all its other contact information.

What data stores do news organizations use?

  • CSV. Comma-separated value documents are a kind of plain-text data store. These files are easy to create and transmit, but they aren’t terribly well-suited for large or complex datasets. Sometimes, though, sources (government websites, for example) provide data only in this format, so it’s a necessary starting point.
  • Spreadsheets. Plain old spreadsheets can be surprisingly effective tools for capturing data and play an important role in data journalism.
  • MySQL. As an open-source relational database engine, MySQL integrates with some of the most popular content management systems, from WordPress to Drupal. In the case of WordPress, MySQL is the only database supported.
  • PostgreSQL. Postgres is another open-source relational database management system. Its features and performance are broadly comparable to MySQL’s. The Chicago Tribune uses Postgres for some of its projects.
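
As a rough sketch of how these data stores fit together, the Python below parses a small CSV and loads it into a relational table, then asks for the phone numbers in one city. Two assumptions to note: the organizations and columns are made up, and SQLite (bundled with Python) stands in for MySQL or PostgreSQL so the example runs without a database server.

```python
import csv
import io
import sqlite3

# Hypothetical CSV text; in practice this would come from a downloaded file.
CSV_TEXT = """name,city,phone
Food Bank,Chicago,555-0100
Literacy Project,Chicago,555-0101
River Trust,Peoria,555-0102
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_database(rows):
    """Copy the parsed rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE organizations (name TEXT, city TEXT, phone TEXT)")
    conn.executemany(
        "INSERT INTO organizations VALUES (:name, :city, :phone)", rows
    )
    conn.commit()
    return conn


def phones_in_city(conn, city):
    """Return (name, phone) pairs for one city, sorted by name."""
    cursor = conn.execute(
        "SELECT name, phone FROM organizations WHERE city = ? ORDER BY name",
        (city,),
    )
    return cursor.fetchall()


rows = load_rows(CSV_TEXT)
conn = build_database(rows)
print(phones_in_city(conn, "Chicago"))
```

The same SELECT statement would work largely unchanged against MySQL or PostgreSQL; what differs is the connection library and how placeholders are written.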

Server-side programming languages
In the realm of newsroom coding, three languages have gained traction: Python, Ruby and PHP. Each is an object-oriented language and a good tool for making complex Web applications. Object-oriented programming emphasizes the use of classes, a kind of templating system. Classes allow code to be compartmentalized and reused, leading to speedier writing and easier maintenance, both necessities when programming in the newsroom on deadline. These languages also share open-source licensing models. This makes it possible to deploy software without the need to secure rights or pay licensing fees.
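
Here is a small Python sketch of what a class as a template can look like. The Story and LiveBlog classes are hypothetical, written only to show code being compartmentalized in one place and reused in another.

```python
class Story:
    """A hypothetical template for news stories; names are illustrative."""

    def __init__(self, headline, body):
        self.headline = headline
        self.body = body

    def word_count(self):
        """Count whitespace-separated words in the body."""
        return len(self.body.split())

    def summary(self):
        """Return the headline followed by the word count."""
        return f"{self.headline} ({self.word_count()} words)"


class LiveBlog(Story):
    """Reuses everything in Story and adds one behavior of its own."""

    def add_update(self, text):
        """Append a new update to the end of the body."""
        self.body = f"{self.body}\n{text}"


brief = Story("Budget vote tonight", "The council meets at 7 p.m.")
live = LiveBlog("Election night", "Polls have closed.")
live.add_update("First results are in.")

print(brief.summary())
print(live.summary())
```

LiveBlog never redefines word_count or summary; it inherits them from Story, so a fix made in one place applies to both kinds of story.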

Unlike some of the other technologies reviewed here, different server-side languages tend to be adopted by different development shops. In part, that’s because each language has its own quirks and strengths, along with a unique set of complementary tools. Sticking with one language streamlines development workflows.

  • Python. The Chicago Tribune, NPR and others use Python to power dynamic projects. Invented in the 1990s, Python has earned a reputation as a language that combines power and ease-of-use. It’s well-suited for a variety of tasks (on and off the Web), and it integrates easily with other technologies. Google uses Python for many of its projects.
  • Ruby. ProPublica uses Ruby for some of its projects. Ruby is, in many ways, comparable to Python, though some consider it more difficult to learn. For some problems, though, Ruby can provide more elegant solutions, and its block functionality is an often-cited advantage. Many websites run on Ruby, perhaps most notably Twitter, although the social network has relied increasingly on Java to power its infrastructure.
  • PHP. Many of the biggest websites run on PHP, including Wikipedia and Facebook. PHP is also readily available — it comes preinstalled on most Web hosting accounts, making it one of the most accessible server-side languages. PHP integrates with other popular tools, including databases like MySQL and SQL Server and content management systems like Drupal and Joomla.

Server-side frameworks
Programmers use frameworks to make server-side languages easier to use and better-suited to the problems they need to solve. Different frameworks extend different languages, and some languages benefit from a group of frameworks, each with its own strengths. Here’s what’s popular in newsrooms.

  • Django. The Chicago Tribune and The New York Times are among the news organizations that use Django to deploy Web applications with Python. Among the technologies covered here, Django is unique in that it emerged from a newsroom. This makes it a great option for building dynamic news websites. Django uses a model-view-controller, or MVC, approach to building Web applications. MVC applications separate the ways data are stored (the model), displayed (the view) and manipulated (the controller) into logical subparts that are easy to mix and match.
  • Ruby on Rails. ProPublica uses Rails to streamline Ruby programming. Like Python, Ruby is a general-purpose language. It’s useful for solving all kinds of problems. The Rails framework makes Ruby particularly well-suited for Web development. Rails also employs an MVC approach to programming.
  • WordPress. Though not strictly a framework, WordPress (the “.org” version of the software — not the hosted tool that makes blogging a breeze) can be used to make PHP programming easier and more productive. As a content management system, WordPress offers an extensive API for creating and managing pages, blog posts and other kinds of content. Yuri Victor of The Washington Post has talked about why that organization uses WordPress.
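
The MVC split described above can be sketched in a few lines of plain Python. This is not Django's or Rails' actual API; every class and function name here is hypothetical, and a list stands in for the database.

```python
import html


# Model: how the data is stored. A plain list stands in for a database.
class StoryModel:
    def __init__(self):
        self._stories = []

    def add(self, headline):
        self._stories.append(headline)

    def all(self):
        return list(self._stories)


# View: how the data is displayed. Here, a simple HTML list.
def render_story_list(headlines):
    items = "".join(f"<li>{html.escape(h)}</li>" for h in headlines)
    return f"<ul>{items}</ul>"


# Controller: handles a request by updating the model and choosing a view.
class StoryController:
    def __init__(self, model):
        self.model = model

    def create(self, headline):
        self.model.add(headline.strip())
        return render_story_list(self.model.all())


controller = StoryController(StoryModel())
controller.create("  Council approves budget ")
page = controller.create("Schools & libraries reopen")
print(page)
```

Because the three parts only meet in the controller, the view could be swapped for one that emits JSON, or the model for one backed by a real database, without rewriting the other pieces.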

Native mobile technologies
Some news organizations offer native mobile apps. These call on platform-specific technologies beyond the tools listed above. Objective-C is the programming language for iOS, the operating system that powers iPads and iPhones. Android devices make extensive use of the Java programming language. It’s also possible to build mobile apps using Web tools, and even so-called “hybrid” apps that are built with Web technologies but deployed as native apps. (I’ve written about native and Web apps and what journalists need to know about the difference.)

Learning more
One of the best ways to learn about how newsrooms use code to enhance their journalism is to hear first-hand from newsroom developers. Fortunately, NPR, the Chicago Tribune, ProPublica and The New York Times all blog about their efforts to innovate in digital journalism.

Along with learning how these teams build their projects, you can review the fruits of their efforts on GitHub, a website programmers (and others) use to store, share and maintain their work. ProPublica, NPR and The Chicago Tribune each maintain code repositories. You can use these to see the code behind some of their projects.



  • http://www.facebook.com/adam.walmsley.58 Adam Walmsley

    nice summary! im part of the codeavengers team providing free online interactive tutorials for HTML/CSS and JavaScript. Feel free to start learning to code at codeavengers.com.

  • cfrech

    This is a great point. Perhaps there is potential for a follow-up on GIS tools and techniques. Much to talk about on that front for sure.

  • jtjohnson

    Good round up, Casey, but you left out the whole of GIS programming. The degree, amount and skill levels vary from elementary to highly intricate and complex, but the GIS perspective should be front and center in every newsroom because it can be used for analysis AND to tell/show the story.
