December 20, 2016
Guest post by Kalev Leetaru
Today the Internet Archive announces a new interactive timeline visualization–the Television Explorer–that lets you trace how any keyword–think “emails”, “tax returns”, “alt-right”–has been covered on U.S. television news over the past half-decade.
Over the past year and a half, the GDELT Project and the Internet Archive’s Television News Archive have worked closely together to visualize how U.S. television news has covered the contentious 2016 political campaign.
One of the tools we created was the 2016 Candidate Television Tracker, which used closed captioning to count how many times each of the presidential candidates was mentioned on television and offered a day-by-day timeline showing the ebbs and flows of who was “winning” the free media wars. (Answer: President-elect Donald Trump.) This tool was used by such media outlets as The Atlantic, The Washington Post, FiveThirtyEight, Politico and The Guardian, among many others.
Now we are adapting this tool to allow more sophisticated searches: rather than just the presidential candidates, you can trace television news coverage of any keyword of your choosing. You can even run advanced searches that find words in conjunction with other words or phrases, such as finding mentions of Hillary Clinton that also discuss her email server. All search results are available for download via CSV and JSON export, making it possible for data journalists, researchers, and advocates to fine-tune their analysis of the data.
When searching, you get back a visual timeline showing how often that word or phrase has appeared on American television news over the past half-decade. Nearly two million hours of television news totaling more than 5.7 billion words from over 150 distinct stations spanning July 2009 to present (though not all stations were monitored for the entire period) are searchable in this interface.
Unlike the Internet Archive’s Television News Archive interface, which returns results at the level of an hour or half-hour “show,” the interface here reaches inside those six and a half years of programming, breaks the more than one million shows into individual sentences, and counts how many of those sentences contain your keyword of interest. Instead of reporting that CNN had 24 hour-long shows yesterday that mentioned Donald Trump one or more times, the interface here will count how many sentences uttered on CNN yesterday mentioned his name–a vastly more accurate metric for assessing media attention.
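To make the difference between show-level and sentence-level counting concrete, here is a minimal Python sketch. It is an illustration only, not the Television Explorer's actual code: the function names, the punctuation-based sentence splitter, and the sample captions are all assumptions made for the example.

```python
import re

def split_sentences(caption_text):
    """Naively split caption text into sentences at '.', '!' or '?' followed by whitespace.

    Assumption: real closed captioning would need a more robust splitter.
    """
    parts = re.split(r"(?<=[.!?])\s+", caption_text.strip())
    return [p for p in parts if p]

def count_mentions(shows, keyword):
    """Return (number of shows mentioning keyword, number of sentences mentioning keyword)."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    show_hits = 0
    sentence_hits = 0
    for text in shows:
        hits = sum(1 for s in split_sentences(text) if pattern.search(s))
        sentence_hits += hits
        if hits:
            show_hits += 1  # a show counts once, however often it mentions the keyword
    return show_hits, sentence_hits

# Hypothetical caption text for three shows (made up for this example).
shows = [
    "Donald Trump held a rally today. The weather was clear. Donald Trump spoke for an hour.",
    "Markets closed higher. Analysts expect more gains.",
    "A new poll was released. Donald Trump leads in two states.",
]

print(count_mentions(shows, "Donald Trump"))  # prints (2, 3)
```

The show-level count treats a show with one passing mention the same as a show devoted entirely to the topic, while the sentence-level count grows with the amount of airtime spent on it.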
Explore how CNN covered the presidential campaign of 2012 versus 2016 and understand just how big a media event this year’s election really was. See precisely when Edward Snowden burst onto the scene and how Wikileaks got more coverage during the 2016 presidential election than during its debut in 2010. Watch the seasonal spikes of Thanksgiving, or see how Ebola received little attention, even as thousands died in Africa, becoming a topic only after the first Americans became infected.
Using the “near” search feature, plot coverage of Wikileaks that also mentioned either “Podesta,” “email,” or “emails” nearby and discover that FOX paid far more attention to the DNC and Podesta email hacks than CNN, MSNBC, CNBC or Bloomberg. In contrast, CNN focused more intensely on the Trayvon Martin shooting (Aljazeera America and Bloomberg were not yet being monitored by the Archive), while Aljazeera led coverage of the Michael Brown and Eric Garner deaths.
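A “near” search of this kind can be thought of as a proximity test. The Python sketch below is one possible reading, not a description of how the Television Explorer implements it: the post does not define what counts as “nearby,” so the word-distance window, the function names, and the sample sentences are all assumptions, and the sketch handles single-word terms only.

```python
import re

def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def near_match(sentence, keyword, context_terms, window=10):
    """Return True if keyword appears within `window` words of any context term.

    Assumption: "near" means word distance inside one sentence; the real
    tool may define proximity differently.
    """
    tokens = tokenize(sentence)
    context = {t.lower() for t in context_terms}
    key_positions = [i for i, t in enumerate(tokens) if t == keyword.lower()]
    ctx_positions = [i for i, t in enumerate(tokens) if t in context]
    return any(abs(k - c) <= window for k in key_positions for c in ctx_positions)

# Hypothetical caption sentences (made up for this example).
sentences = [
    "Wikileaks released another batch of Podesta emails today.",
    "Wikileaks was founded in 2006.",
]

context_terms = ["Podesta", "email", "emails"]
matches = [s for s in sentences if near_match(s, "Wikileaks", context_terms)]
print(len(matches))  # prints 1
```

Counting only the sentences that pass such a test, station by station, would yield the kind of per-network comparison described above.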
Search for “ivory” to see that Aljazeera America (which ceased operation in April 2016) devoted vastly more of its coverage to elephant poaching in Africa than any other monitored national network. It also paid the most attention to “Africa” and to the “refugee” crisis. On the other hand, Bloomberg has devoted much more of its time to “China” and to the economic crisis in “Greece” last year.
We look forward to seeing what people do with this new tool. Please share your favorite searches on Twitter with the hashtag “#internetarchivetvsearch”. If you have any questions, please email firstname.lastname@example.org or email@example.com.
Kalev Leetaru is an independent data journalist.