Database journalism

April 25, 2009. 11 comments

A discussion with a commenter here on Window on the Media pushed me to write the database journalism article on Wikipedia.

Database journalism was born in the 1950s as a synonym for computer-assisted reporting. Since then, computers have become ubiquitous, to the point that database journalism in its original meaning has merged with the very definition of journalism.

The development of the internet has given database journalism a new definition: a process in which the database, rather than the story as in traditional journalism, becomes the center of the journalistic work.

Go see. And do add your 2 cents about it.

11 Responses to “Database journalism”

  1. Interesting, and a good start indeed. As a bit of a critique, I think you are commingling a few distinct threads here that would be a bit more valuable if unraveled.

    1) The first wave, if you want to call it that, was more about bringing the tools of social science (statistics, in other words) to mainstream reporting than it was about hardcore database-driven research. There are many others out there better equipped than I to talk about this era (Phil Meyer, for example). It was a very small community back then because computer time was expensive, and quite a bit of knowledge was required to even get started.

    2) The second wave (which coincided with the advent of the cheap desktop PC in the late 1970s and 1980s) is characterized by the growth of what we now call “computer-assisted reporting.” I’m not sure everyone would agree with this, but to my mind the distinguishing characteristic of the second wave was that stories were often driven by the data analysis. The news organization, in other words, was its own source.

    The fact that data analysis was used was not hidden from the reader, and the news organization often went out of its way to provide a detailed methodology.

    Elliot Jaspin, David Burnham and Bill Dedman were among the first of this next wave of practitioners. The big boom in the States happened after Dedman won the Pulitzer for “The Color of Money” (I wrote about it recently here: http://www.aronpilhofer.com/posts/10). And through the 1990s, you saw a huge expansion of what we call CAR and the tools/techniques CAR folks used. GIS, data mining and even things like social network analysis became commonplace.

    From 1989 on, US newsrooms significantly ramped up their CAR staffs, to the point where, today, almost every major news organization has someone trained in these techniques. You also saw an exportation of CAR to other countries, in particular Canada, Denmark and South Korea.

    News organizations in the UK have recently started to catch up, in large part because a new FOI law has given reporters the ability to demand certain records in electronic format.

    3) The third wave of what you call database journalism (not sure I would, but I don’t exactly have a better label) is intrinsically tied to the internet as a platform for sharing data. You could mark the beginning point sometime around the early 2000s (perhaps 2004 or so, when Adrian Holovaty released chicagocrime.org). In the UK, MySociety predates chicagocrime by about a year, though I’m not sure they would consider themselves journalists (I would).

    There are a few things that distinguish this movement from the two previous ones: First, there’s the obvious introduction of the web as a tool for reporting and publication. Second, there’s a notion that computers can automate some of the roles of a journalist (chicagocrime, for example). Third, there’s the idea that data itself can be the story (or stories), that there’s not necessarily a need for traditional narrative-style reporting and analysis. Fourth, there’s a notion that the journalist is everyone: we should put the data out there and let the public tell us what’s interesting or newsworthy.

    Adrian was certainly one of the first, but there are others. Derek Willis (who works on my team now) was one of the first print journalists I know who made the switch to digital, leaving the Washington Post’s newsroom to work with Adrian on the digital side. The LA Times has had a team of folks doing this kind of work for a couple of years, as has the New York Times. Gannett made a major push at the corporate level a couple years ago to bring database-driven journalism to its newsrooms.

    On the nonprofit side, the Sunlight Foundation has done amazing work, and MySociety continues to set the bar higher with sites like FixMyStreet and TheyWorkForYou.

    And just a week ago, Matt Waite at the St. Pete Times was the first of this new breed of journalist/developer to share in a Pulitzer Prize for the website he developed, PolitiFact. It is the first Pulitzer ever awarded to a website.

    In the UK, you’ve seen the Guardian’s amazing efforts to expand the idea of what database journalism can be with their open platform, APIs and data visualization. The Telegraph has its “lab” and the BBC has done some very interesting and creative things with its APIs.

    The bottom line is that all three of these movements or waves share a common idea: that data and data analysis can reveal important trends that would otherwise stay hidden. What differentiates them are the tools, the medium and the philosophy of who should play the role of journalist.

  2. Nicolas says:

    Aron,

    Thanks for your detailed analysis! I still think that points 1 & 2 go together and that the shift Holovaty wrote about has to do with journalism revolving around data-enhanced stories vs. journalism centered around data (that can be enhanced with stories).

    Anyway, I’ll try to update the Wikipedia entry as soon as I find the time!

  3. Yeah, I can see that: 1 & 2 have a lot more to do with one another than 3, but I would argue that they are all part of the same general movement.

  4. Abe says:

    Isn’t there already a Wikipedia article on computer-assisted reporting? These two should be one article.

  5. Nicolas says:

    @Abe and Aron:
    Changes done. Thanks for your insights!

  6. Abe says:

    I believe you have it backwards, subsuming the computer-assisted article as a subsection of “database journalism.”

  7. Abe says:

    What I mean to say is that this statement isn’t correct: “to the point that database journalism in its original meaning has come to merge with the very definition of journalism.” No, what we mean by computer-assisted reporting, or database journalism, is different from much else that would fall under the term journalism. You’ll need to work on that sentence…

  8. Nicolas says:

    Abe,

    I believe the sources in the article make it clear that CAR can be considered as any use of computers in journalism and that such use has become absolutely mainstream in most newsrooms. Now, if you want to challenge that, do so on Wikipedia with sourced statements.

  9. [...] kind of initiative is at the heart of database journalism, which puts data at the heart of the process [...]

  10. [...] ever since I started breaking ground in the field of database journalism, doors close as soon as I ask for data [...]
