Data Journalism: The hacks become the hackers

How fast is LAFD where you live? An analysis by the Los Angeles Times Data Desk

Interactive map from an LA Times series on emergency response times

In the US state that supplies most of the world’s bourbon, I saw a possible future for journalism. In that future, reporters are as adept at crunching numbers and interrogating databases as they are at interviewing, and write code as much as they do copy.

I once dismissed the idea that a journalist needed to learn how to program as a waste of time. Then last summer I tried my hand at some basic web programming (HTML, CSS, JavaScript and jQuery) after attending Computer Assisted Reporting Bootcamp, a course put on by the Media Alliance in association with the US-based National Institute for Computer Assisted Reporting (NICAR). It proved valuable – I can now create simple tools such as interactive tables for The Australian Financial Review’s website – but I remained unconvinced.

But that was before I travelled to Louisville, Kentucky, for the 20th NICAR conference earlier this year. I was one of 600 journalists, programmers and designers who had gathered from 15 countries to exchange techniques and ideas over four days.

I alternated between feeling completely inspired and utterly overwhelmed, so it helped that I was in a state that claims to produce 95 per cent of the world’s bourbon.

The conference was host to the full spectrum of journalists, from reporters who had only ever used basic Excel commands through to those who spent their days programming. I heard from journalist after journalist who had learned how to code and been converted to the benefits of programming.

The 120-odd panels and courses ranged from general case studies and ideas for data-driven stories through to classes on advanced programming and system administration.

In some classes, I could follow what was being said, or at least the basics, such as how to bulletproof your data analysis or how to build a timeline using JavaScript.

In others, I made do with merely learning of the existence of a technique or tool. The introduction to the programming language Ruby had me baffled within five minutes – but I could see how it could be used to extract data and tell stories.

The hands-on classes were packed, with attendees rushing to get one of the precious seats and an even more precious power outlet for their laptop. The evenings saw the conference split up into like-minded groups.

Some attendees flocked together in the hotel lobby until the early hours of the morning, swapping programming techniques.

Other, much larger, groups explored the various bars of Louisville, where bartenders were only too keen to help them warm up in the sub-zero temperatures with different types of bourbon. My main recollections are that they treat bourbon like wine, and it is a good idea to let the ice melt a little before drinking.

The talk at the bars covered a much wider range of topics – and some of the journalists proved to be as excellent at karaoke as they were at programming.

Personally, I found the conference’s most useful aspect was swapping notes with other data journalists. The common theme was that often the easiest way to access technical skills was to learn them yourself.

During the conference, I learned about stories that simply would not have been possible without the journalists having some technical skills.

  • Ben Welsh and the data team at The Los Angeles Times were able to highlight deep-seated problems with the 911 emergency service by analysing millions of city records. http://www.latimes.com/news/local/lafddata/
  • Tasneem Raja and the data team at Mother Jones were able to show the rise in mass gun shootings by using primary research to build a database of every mass shooting over 20 years. http://www.motherjones.com/special-reports/2012/12/guns-in-america-mass-shootings
  • Ritchie King from business website Quartz used tax tables and some fancy programming to produce an elegant interactive calculator that allowed readers to compare how much tax they would have paid over time. http://qz.com/37639/check-your-us-tax-rate-for-2012-and-every-year-since-1913/
  • That is not to mention the data work done by The International Consortium of Investigative Journalists (ICIJ), where journalists from around the world analysed an unprecedented cache of 2.5 million files in order to expose the secret wealth of politicians and a dizzying array of prominent and not-so-prominent people. http://www.icij.org/offshore

So I have now been fully converted. This doesn’t mean I think that journalists will need to suddenly learn how to program overnight. But it would help to at least know the basic functions of a spreadsheet (sum, sort and filter) and how to calculate some basic descriptive statistics (mean, median and percentage change).
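For readers curious what those basics look like outside a spreadsheet, here is a minimal sketch in Python. The response-time figures and the `percentage_change` helper are made up for illustration and do not come from any of the projects mentioned in this article.

```python
from statistics import mean, median

# Hypothetical response times in minutes (invented figures, illustration only).
response_times = [4.0, 5.0, 6.0, 7.0, 18.0]


def percentage_change(old, new):
    """Percentage change from old to new: (new - old) / old * 100."""
    return (new - old) / old * 100


total = sum(response_times)                      # spreadsheet SUM
ordered = sorted(response_times, reverse=True)   # sort, slowest first
slow = [t for t in response_times if t > 6.0]    # filter: keep rows over 6 minutes

average = mean(response_times)    # pulled upwards by the single 18-minute call
middle = median(response_times)   # the middle value, less affected by outliers

change = percentage_change(200, 250)  # e.g. a count rising from 200 to 250

print(total, average, middle, slow, change)
```

The gap between the mean and the median in this toy example is the kind of thing worth checking before quoting an "average" in a story: one extreme value can shift the mean a long way while the median barely moves.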

The Financial Times interactive producer, Martin Stabe, describes Excel as “a sort of gateway drug to more complex coding”.

On a more practical level, programming can also lead to lots of job opportunities, which are otherwise few and far between in journalism. A message board at the conference was filled with job ads for journalists with data-related skills.

“In the US, there aren’t enough qualified applicants to fill all the job openings for journalists with data analysis or programming skills,” says Mark Horvit, the executive director of Investigative Reporters & Editors, the organisation that manages NICAR.

“We’ve seen a steep increase in the number of positions being created for reporters and editors with these skills in the past two years. There are base technical skills people now need to have.”

Horvit emphasises a message that was repeated throughout the conference: the focus of any journalism data work should always be on the story.

“If you are a journalist, you don’t want to crunch data and forget everything else you know,” he says. “[The data skills are] supposed to be in service of the story, not the other way around.”

Meet the people

…and find out what they’re doing

Ben Welsh

Database producer, Los Angeles Times

Job description

“Basically I try to turn databases into news and that can take a lot of different forms. It can be maps, it can be front page stories, it can be interactive graphics… the whole spectrum.” Welsh also develops news applications for the latimes.com website.

Key technical skills

Knows web development tools including HTML, CSS, JavaScript and jQuery; web application tools Python and Django; the database query language SQL; and mapping tools such as GeoDjango, Leaflet, PostGIS, QGIS and TileMill.

Background

Studied journalism and worked at NICAR (The National Institute for Computer-Assisted Reporting).

Favourite piece of work

“Life on the line: 911 breakdowns at Los Angeles Fire Department”

“In the last year, we did an interactive series about 911 response times in Los Angeles. We did data analysis that had impact and led to change. I liked the LAFD series because we were able to step up and cover a breaking local political issue in a more complete way thanks to data analysis, which connected the numbers to people and then to policy.”

http://www.latimes.com/news/local/lafddata/

Tasneem Raja

Interactive editor, Mother Jones

Job description

“I lead a team that focuses on creating interactive graphics and helping our reporters become better at data-driven investigations.”

Background

Was a features reporter and copy editor before crossing over to “the dark side”.

Key technical skills

Raja knows HTML, CSS, “a bit of Javascript”, front-end development, mapping programs such as TileMill and Google Fusion, and project management.

Favourite work

“America Under the Gun”

“At Mother Jones we created a guide to mass shootings in America. It’s 20 years of data and it’s cited… as the only comprehensive collection of data. We were able to debunk quite a lot of myths. We were working on it in an iterative way for more than a year. We thought that mass gun shootings were on the rise but we didn’t have a way of testing that hypothesis, and the only way we were going to be able to do that was by creating a dataset.”

http://www.motherjones.com/special-reports/2012/12/guns-in-america-mass-shootings

Michael Keller

Senior data reporter, The Newsweek Daily Beast Company

Job description

Keller is a one-man data journalism team at his publication and describes his role as “pretty vast”.

“It covers computer assisted reporting, interactive design and programming. The job also includes a lot of phone calling. You can never get away from the standard tools of the trade.”

Background

Started off writing articles and stories. Is mostly self-taught.

Key technical skills

Knows web development tools such as HTML, CSS and JavaScript; the database query language SQL; the statistical program R; and mapping tools such as CartoDB, Leaflet, PostGIS and QGIS.

Favourite piece of work

“Roe v. Wade turns 40”

“We did one for the 40th anniversary of Roe v Wade which was the Supreme Court decision that legalised abortion [in the US]. We spent about six months creating our own database of every single abortion clinic in the country and then we called over 750 places to verify their addresses, that they were still open. We ended up doing up a map and a series of stories about access to services. Over the years there has been diminishing access and increased provisions and restrictions.”

http://www.thedailybeast.com/articles/2013/01/22/roe-v-wade-turns-40.html

Becca Aaronson

Healthcare and data reporter, The Texas Tribune

Job description

Aaronson is one of two data reporters at the Tribune, a publication that takes a particularly data-driven approach to its reporting. “I report on healthcare as a beat while developing data interactives. [I’m interested in] what will make some technical statistics come alive and be relatable to people or how can I make something within the data and take it further.”

Background

Started out as a reporter and, over a couple of years, has learned the basics of programming.

Technical skills

Web development tools such as HTML, CSS and JavaScript; database query language SQL; and mapping software QGIS.

Favourite piece of work

Interactive: “Economic Impact of Medicaid Expansion by Legislative District”

“It shows by legislative district how much money the expansion would bring in. [There is a] big breadth of information, you can see it quickly. It localises the information for you. I want [the map] to be useful for my reader.”

http://www.texastribune.org/library/data/impact-medicaid-expansion-legislative-district/

Ritchie King

Reporter, Quartz

Job description

King is one of two data reporters at the mobile-first business website Quartz. “I’m a reporter who not only writes, but also makes charts, creates data visualisations, produces diagrams and illustrations, and communicates in other visual ways.”

Background

Was in engineering before journalism school and took classes on HTML, CSS and JavaScript. Taught himself the rest or asked colleagues for help.

Technical skills

“When it comes to data gathering, I’m learning how to mine the internet with Python and recently wrote my first web-scraping script. When it comes to data analysis, I know both spreadsheet applications like Microsoft Excel and the statistical programming language, R, really well. When it comes to visualisation, I know both static design software such as Adobe Illustrator and also how to design directly for the web using HTML, CSS, and Javascript.”

Favourite project

“Check your US tax rate for 2012 and every year since 1913”

“At the time it was fairly timely because there was an active debate about raising tax rates. It’s also an evergreen piece – people are always interested in their taxes. There is an incredible amount of thought behind the interactive but you can’t tell. I felt I was able to distil it into a pretty straightforward [interactive].”

qz.com/37639/check-your-us-tax-rate-for-2012-and-every-year-since-1913/

Edmund Tadros is a data journalist for The Australian Financial Review. He is currently developing a training course on “data journalism” for the Walkley Foundation which will be delivered in the major capital cities this year. Keep up to date on future training courses at www.walkley.com/training

Brian Hamman

Deputy Editor, Interactive News, New York Times

Job description:

Hamman is part-manager of the team of 14 developers and designers who create the Times’ news applications and interactive features.

“I basically keep the trains running. My job is a lot of translating and working between disciplines. There is a lot of cross-department collaboration, making sure that things go smoothly. There’s an inherent tension between building good, technically solid software and the (needs) of a newsroom. So you have to balance this in the best way and (find) the expedient way to do things. Building software in a newsroom, everything is changing. You are laying down tracks as the train is moving (on it) at different speeds.”

Technical skills

“Web application development using Ruby on Rails, Javascript, CSS, HTML and Node.js and the various libraries and frameworks available to each. Data analysis using MySQL and R, and managing Amazon Servers.”

Background

Hamman has both a technical and reporting background. While obtaining a Masters in Journalism, he was also working as a data analyst at the National Institute for Computer-Assisted Reporting. “I have always had a dual journalist and technical background. So we try to hire people who have the same background as me.”

Favourite piece of work

“The Oscars project is a great example of telling a story using technology. Our intention was to capture the narrative of the three-hour red carpet and awards show broadcast in a web interface. Our discussions felt more like planning a parallel television broadcast than building a piece of software.”

http://www.nytimes.com/projects/oscars/2013/index.html

Advice to beginners

“If you want to learn this stuff, you need to learn the technical skills and the best way is to find a project you are passionate about. You need to think like a project manager, you don’t just do your story.”

Martin Stabe

Interactive Producer, Financial Times

Job description:

“I’m a journalist who works on the interactive desk, a team within the newsroom with producers and developers. We do a lot of the digital things that have no print analog: podcasts, interactive graphics, blogs, social media. We have three producers, three developers, a social media editor and a blogs editor. We also work closely with three (full-time equivalent) interactive designers.”

Background:

“I’m a journalist first and foremost. I worked previously in business-to-business magazines, where if you can’t do it yourself, it doesn’t happen.”

Technical skills:

“I’ve been working with HTML and CSS since I was in school and built PHP-based websites as a student. But my role in the FT’s interactive team is primarily to do the ‘data-wrangling’ – sourcing, cleaning and organising the databases that control our interactive graphics. Most of that can be done in Excel or SQL. I also work a lot with geographical information systems tools, particularly QGIS, to prepare geographical data for producing online maps.”

Favourite piece of work:

How fast is the London Fire Brigade?

“(This was) built using TileMill, Leaflet and CartoDB following data cleaning and analysis in our MySQL database and the GIS application QGIS. This project has made the way we build online maps far more sophisticated. The result was the most granular response time map that has ever been produced for London. It was all done in only a few days after the London Fire Brigade warned us that they intended to release our FOIA request on the London Open Data site. The next step is the far more complex task of modelling the impact of the proposed changes.”

ft.com/firemap

Advice to beginners:

“Learn Excel. Spreadsheet formulas are immediately useful, easy to learn and are a sort of gateway drug to more complex coding. You can achieve a lot in Excel by learning a few of the mathematical functions, but once you master more advanced functions, there are many things that will look very familiar when you move on to SQL or any scripting languages. If you understand the Excel VLOOKUP function for example, database joins will make sense to you, and Excel IF conditionals and the various string manipulation functions work much as they do in other languages.”
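Stabe’s comparison between VLOOKUP and database joins can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the station codes, names and call counts are invented, and the in-memory SQLite database stands in for whatever SQL system a newsroom might actually use.

```python
import sqlite3

# Hypothetical data: invented station codes, names and call counts.
stations = [("S1", "Central"), ("S2", "Harbour")]
calls = [("S1", 120), ("S2", 85), ("S3", 40)]

# VLOOKUP-style: look up each code in a reference table,
# falling back to "#N/A" when there is no match (as Excel would display).
station_names = dict(stations)
looked_up = [(code, station_names.get(code, "#N/A"), total) for code, total in calls]

# IF-style conditional, similar in spirit to =IF(total > 100, "busy", "quiet").
labels = ["busy" if total > 100 else "quiet" for _, total in calls]

# The same lookup expressed as a SQL LEFT JOIN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stations (code TEXT, name TEXT)")
conn.execute("CREATE TABLE calls (code TEXT, total INTEGER)")
conn.executemany("INSERT INTO stations VALUES (?, ?)", stations)
conn.executemany("INSERT INTO calls VALUES (?, ?)", calls)
joined = conn.execute(
    "SELECT calls.code, stations.name, calls.total "
    "FROM calls LEFT JOIN stations ON calls.code = stations.code "
    "ORDER BY calls.code"
).fetchall()
conn.close()

print(looked_up)
print(labels)
print(joined)
```

In both versions the unmatched code S3 is kept rather than silently dropped: the lookup marks it "#N/A" and the LEFT JOIN returns an empty (NULL) name, which is usually what a reporter wants to see when checking for gaps in the data.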

First published in WALKLEY MAGAZINE #74