Data Download

Put the Archive to Work

About the Dataset

The Political TV Ad Archive, powered by the Internet Archive, draws on a variety of sources to create an archive of political TV ads in key primary states. Each ad has its own canonical web page (sample here) and associated downloadable metadata. Some metadata are added by the Internet Archive; some are generated from the media itself (such as the count of ads and how many times an ad has aired); and some come from our partners.

TV Recording

The Political TV Ad Archive gathered TV video in 20 markets across nine key states in the 2016 primary elections:

Iowa (Des Moines-Ames; Cedar Rapids-Waterloo-Iowa City-Dubuque; Sioux City), New Hampshire (Boston-Manchester), Nevada (Las Vegas; Reno), South Carolina (Columbia; Greenville-Spartanburg), Colorado (Colorado Springs-Pueblo; Denver), North Carolina (Charlotte; Raleigh-Durham-Fayetteville), Virginia (Roanoke-Lynchburg; Norfolk-Portsmouth-Newport News; Washington, DC-Hagerstown), Ohio (Cleveland-Akron-Canton; Cincinnati), and Florida (Tampa-St. Petersburg-Sarasota; Orlando-Daytona Beach-Melbourne; Miami-Ft. Lauderdale).

In addition, the project benefits from television collected for the Internet Archive’s TV news library, which includes the San Francisco; Washington, DC; and Philadelphia areas. Going forward, the project will continue to collect television in those cities as well as in New York City. Meanwhile, we are evaluating the project and fundraising to collect more political ads in key battleground states in the general elections.

Details of which channels are collected, when collection was started, and when stopped, if appropriate, are below:

started       stopped       location                 station name
2014-06-08    n/a           Philadelphia, PA, USA    WUVP
2014-06-06    n/a           Philadelphia, PA, USA    KYW
2014-06-06    n/a           Philadelphia, PA, USA    WTXF
2014-06-06    n/a           Philadelphia, PA, USA    WPVI
2014-06-06    n/a           Philadelphia, PA, USA    WCAU
2001-01-10    n/a           Philadelphia, PA, USA    NHK
2011-08-04    n/a           Philadelphia, PA, USA    RT
2010-07-15    n/a           Philadelphia, PA, USA    FRANCE24
2014-06-06    n/a           Philadelphia, PA, USA    WHYY
2014-09-03    n/a           Philadelphia, PA, USA    WPSG
2014-09-03    n/a           Philadelphia, PA, USA    WPHL
2001-02-12    n/a           Woodbridge, VA, USA      WRC
2001-02-12    n/a           Woodbridge, VA, USA      WJLA
2001-02-12    n/a           Woodbridge, VA, USA      WUSA
2001-02-12    n/a           Woodbridge, VA, USA      WTTG
2015-07-31    n/a           New York, NY, USA        WNBC
2015-07-30    n/a           New York, NY, USA        WCBS
2015-07-31    n/a           New York, NY, USA        WABC
2015-07-31    n/a           New York, NY, USA        WNYW
2015-10-02    2016-03-02    Boston, MA, USA          WFXT
2015-10-02    2016-03-02    Boston, MA, USA          WBZ
2015-10-02    2016-03-02    Boston, MA, USA          WCVB
2015-10-02    2016-03-02    Boston, MA, USA          WHDH
2015-09-30    2016-03-02    Boston, MA, USA          WMUR
2015-10-14    2016-03-02    Des Moines, IA, USA      KCCI
2015-10-14    2016-03-02    Des Moines, IA, USA      KDSM
2015-10-14    2016-03-02    Des Moines, IA, USA      WHO
2016-01-06    2016-03-02    Des Moines, IA, USA      WOI
2015-10-19    2016-03-02    Cedar Rapids, IA, USA    KCRG
2015-10-19    2016-03-02    Cedar Rapids, IA, USA    KFXA
2015-10-19    2016-03-02    Cedar Rapids, IA, USA    KGAN
2015-10-19    2016-03-02    Cedar Rapids, IA, USA    KWWL
2015-10-13    2016-03-02    Sioux City, IA, USA      KCAU
2015-10-13    2016-03-02    Sioux City, IA, USA      KMEG
2015-10-13    2016-03-02    Sioux City, IA, USA      KPTH
2015-10-13    2016-03-02    Sioux City, IA, USA      KTIV
2015-12-11    2016-03-02    Las Vegas, NV, USA       KSNV
2015-12-11    2016-03-02    Las Vegas, NV, USA       KVVU
2015-12-10    2016-03-02    Las Vegas, NV, USA       KLAS
2015-12-11    2016-03-02    Las Vegas, NV, USA       KTNV
2015-12-28    2016-03-02    Columbia, SC, USA        WIS
2015-12-28    2016-03-02    Columbia, SC, USA        WACH
2015-12-28    2016-03-02    Columbia, SC, USA        WLTX
2015-12-28    2016-03-02    Columbia, SC, USA        WOLO
2015-12-28    2016-03-02    Greenville, SC, USA      WYFF
2015-12-28    2016-03-02    Greenville, SC, USA      WHNS
2015-12-28    2016-03-02    Greenville, SC, USA      WSPA
2015-12-28    2016-03-02    Greenville, SC, USA      WLOS
2016-01-01    2016-03-02    Reno, NV, USA            KRNV
2016-01-01    2016-03-02    Reno, NV, USA            KRXI
2016-01-02    2016-03-02    Reno, NV, USA            KTVN
2016-01-01    2016-03-02    Reno, NV, USA            KOLO
2016-01-01    2016-03-09    Denver, CO, USA          KUSA
2016-01-01    2016-03-09    Denver, CO, USA          KDVR
2016-01-01    2016-03-09    Denver, CO, USA          KCNC
2016-01-01    2016-03-09    Denver, CO, USA          KMGH
2016-01-06    2016-03-23    Cleveland, OH, USA       WKYC
2016-01-06    2016-03-23    Cleveland, OH, USA       WJW
2016-01-06    2016-03-23    Cleveland, OH, USA       WOIO
2016-01-06    2016-03-23    Cleveland, OH, USA       WEWS
2016-01-06    2016-03-23    Cincinnati, OH, USA      WLWT
2016-01-06    2016-03-23    Cincinnati, OH, USA      WXIX
2016-01-06    2016-03-23    Cincinnati, OH, USA      WKRC
2016-01-06    2016-03-23    Cincinnati, OH, USA      WCPO
2016-01-06    2016-03-23    Orlando, FL, USA         WESH
2016-01-06    2016-03-23    Orlando, FL, USA         WOFL
2016-01-06    2016-03-23    Orlando, FL, USA         WKMG
2016-01-06    2016-03-23    Orlando, FL, USA         WFTV
2016-01-06    2016-03-23    Miami, FL, USA           WTVJ
2016-01-06    2016-03-23    Miami, FL, USA           WSVN
2016-01-06    2016-03-23    Miami, FL, USA           WFOR
2016-01-06    2016-03-23    Miami, FL, USA           WPLG
2016-01-06    2016-03-23    Tampa, FL, USA           WFLA
2016-01-06    2016-03-23    Tampa, FL, USA           WTVT
2016-01-06    2016-03-23    Tampa, FL, USA           WTSP
2016-01-06    2016-03-23    Tampa, FL, USA           WFTS
2016-01-06    2016-03-09    Norfolk, VA, USA         WAVY
2016-01-06    2016-03-09    Norfolk, VA, USA         WVBT
2016-01-06    2016-03-09    Norfolk, VA, USA         WTKR
2016-01-06    2016-03-09    Norfolk, VA, USA         WVEC
2016-01-13    2016-03-23    Raleigh, NC, USA         WNCN
2016-01-13    2016-03-23    Raleigh, NC, USA         WRAZ
2016-01-13    2016-03-23    Raleigh, NC, USA         WRAL
2016-01-13    2016-03-23    Raleigh, NC, USA         WTVD
2016-01-19    2016-03-09    Colorado Springs, CO, USA KOAA
2016-01-19    2016-03-09    Colorado Springs, CO, USA KXRM
2016-01-19    2016-03-09    Colorado Springs, CO, USA KKTV
2016-01-19    2016-03-09    Colorado Springs, CO, USA KRDO
2016-01-26    2016-03-01    Roanoke, VA, USA         WSLS
2016-01-26    2016-03-01    Roanoke, VA, USA         WFXR
2016-01-26    2016-03-01    Roanoke, VA, USA         WDBJ
2016-01-26    2016-03-01    Roanoke, VA, USA         WSET
2016-02-09    2016-03-23    Charlotte, NC, USA       WCNC
2016-02-09    2016-03-23    Charlotte, NC, USA       WJZY
2016-02-09    2016-03-23    Charlotte, NC, USA       WBTV
2016-02-09    2016-03-23    Charlotte, NC, USA       WSOC

Audio Fingerprinting

We are using audfprint technology to identify unique instances of political ads contained in streams of television. Developed by Dan Ellis at Columbia University, this tool can identify segments of identical audio by comparing audio “fingerprints.” This robust open source system is able to hear past added noise and to adjust to time skews and different encoding schemes. The Duplitron, a tool developed by senior engineer Dan Schultz, is also open source and is available here. To read more about the technology behind this project, click here.

For an introduction to the Political TV Ad Archive and how to use it, please check out this video.

Metadata

Metadata for this project may be downloaded in CSV format. There are two different downloads available. The first, “Download details of ad airings on TV,” provides details about airings of ads on TV, giving information about when and where they aired. The second, “Download list of unique ads archive,” provides information on every ad archived by the project, whether or not that ad has been captured as airing on television. (For example, the ad may appear on a social media channel, such as YouTube or Snapchat.)
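As a rough illustration of working with the airings download, the sketch below reads CSV rows with Python's standard csv module and counts airings per ad and per network. The column names come from the field descriptions in this document; the sample rows, the archive_id values, and the function name count_airings are made up for the example, and the real file may contain more columns or different value formats.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for the airings download (assumption: the real
# CSV uses a header row with the column names described in this document).
SAMPLE_AIRINGS = """\
archive_id,network,location,start_time
polad_example_a,KCCI,"Des Moines, IA",2016-01-20 01:30:00
polad_example_a,WHO,"Des Moines, IA",2016-01-20 02:10:00
polad_example_b,WMUR,"Boston, MA",2016-01-21 23:05:00
"""


def count_airings(csv_file):
    """Return two Counters: airings per archive_id and airings per network."""
    per_ad = Counter()
    per_network = Counter()
    for row in csv.DictReader(csv_file):
        per_ad[row["archive_id"]] += 1
        per_network[row["network"]] += 1
    return per_ad, per_network


per_ad, per_network = count_airings(io.StringIO(SAMPLE_AIRINGS))
print(per_ad.most_common())
print(per_network.most_common())
```

To run this against a downloaded file, one would pass an open file object instead of the in-memory sample, for example open("airings.csv", newline="", encoding="utf-8"), where the file name is hypothetical.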

 

Ad airings on TV metadata

wp_identifier = A unique numeric id for each ad identified, assigned by the Political TV Ad Archive project. Type: number.

network = TV channel on which the ad aired. Type: text.

location = Name of market area covered by broadcast.  Type: text.

program = Name of TV program in which ad aired. Type: text.

program_type = “News” or “not news,” representing type of TV programming. Type: text.

start_time = Date/time the ad airing started. Note: these times are in UTC (Coordinated Universal Time). Converting to local time requires knowing the local time zone of the market, with special attention to seasonal time changes. Type: date/time.

end_time = Date/time the ad airing ended. Note: these times are in UTC (Coordinated Universal Time). Converting to local time requires knowing the local time zone of the market, with special attention to seasonal time changes. Type: date/time.

archive_id = A unique alphanumeric id for each ad identified, corresponding with the id used on PoliticalAdArchive.org. To see the ad on the website, add the prefix “http://politicaladarchive.org/ad/” to the archive_id, add a forward slash at the end, and paste the resulting URL into a browser. For example, polad_berniesanders_f0chv becomes http://politicaladarchive.org/ad/polad_berniesanders_f0chv/ Type: text.

embed_url = URL for embedding the ad. For embed code, use this id within this sample embed code: <iframe src="https://archive.org/embed/PolAd_MarcoRubio_0py1v" width="640" height="480" frameborder="0" webkitallowfullscreen="true" mozallowfullscreen="true" allowfullscreen></iframe> Type: text.

sponsor = Ad sponsor, as it appears in the ad. Type: text.

sponsor_type = Candidate committee, super PAC, 501(c), 527, etc. Source: the Center for Responsive Politics. Type: text.

sponsor_affiliation = If applicable, the candidate associated with a particular sponsor. For example, Conservative Solutions PAC is a super PAC associated with Marco Rubio. Source: the Center for Responsive Politics. Type: text.

sponsor_affiliation_type = If the sponsor is associated with a particular candidate, whether it supports or opposes that candidate. For example, Conservative Solutions PAC is a super PAC that supports Marco Rubio. Source: the Center for Responsive Politics. Type: text.

race = Pres, Senate, or House. The federal race the ad is targeted toward. For Senate and House, the state is also indicated, along with the district. Source: the Center for Responsive Politics. Type: text.

cycle = Election cycle, e.g., 2016 = the 2015-2016 elections. Source: the Center for Responsive Politics. Type: text.

subject = Subjects covered in ad; subject index from PolitiFact, input by Internet Archive researchers. Type: text.

candidate = Candidate(s) named in ad; input by Internet Archive researchers. Type: text.

type = Campaign ad, issue ad, or unknown; input by Internet Archive researchers. Most ads in this archive are “campaign ads,” that is, ads that are targeted toward particular candidates. However, some ads are “issue ads,” ads that cover “a national legislative issue of public importance.” Federal Communications Commission (FCC) rules require that TV stations disclose ad buy contracts for both types of ads; therefore the Political TV Ad Archive includes such ads in this collection. Example: this ad on Puerto Rico debt. Type: text.

message = Pro, con, or mixed; input by Internet Archive researchers. Pro = ad mentions one or more candidates in a positive way, with no negative message about any candidate (important: this applies only to candidates running in the current election and race). Example: this ad sponsored by Donald Trump’s candidate committee mentions only him and does so in a positive way. Con = ad mentions one or more candidates in a negative way. Example: this ad sponsored by the Right to Rise super PAC, which supports Jeb Bush, mentions only Marco Rubio and in a negative fashion. It includes references to “liberal Democrats,” but none are candidates in the 2016 presidential race. Mixed = any ad that mentions more than one candidate in a particular race, with significant positive content about one or more candidates and negative content about one or more candidates. Example: this ad, sponsored by the Right to Rise super PAC, criticizes Rubio but praises Jeb Bush. Type: text.

air_count = Total number of times this particular ad has aired, as captured by the Internet Archive in key primary states. Important: we capture all airings of an ad, not just paid airings; if a clip is replayed as part of a TV news broadcast, that will be represented in the count. Also, while this is a national total, it pertains only to the states the Internet Archive is tracking. A list of these states can be seen above. Type: number.

market_count = Total number of markets where this ad has aired, as captured by the Internet Archive. Important: we capture all airings of an ad, not just paid airings; if a clip is replayed as part of a TV news broadcast, that will be represented in the count. Also, this count refers only to markets tracked by the Internet Archive. Type: number.

date_created = Date the Political TV Ad Archive counted the first airing of this particular ad. Filter on this field to see which new ads the project has documented since you last viewed the site. In other words, if the data are filtered for “February 10, 2016,” the viewer will see a list of ads first counted as airing on TV on that date; in this way, researchers can see which ads are new to the archive. Note: no dates exist before February 9, 2016, not because ads weren’t counted before then, but simply because that is the date this feature was added to the data. Type: date.
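Several of the fields above describe small transformations: converting UTC airing times to local time, turning an archive_id into an ad page URL, and wrapping an embed URL in embed code. The sketch below shows one possible way to do each in Python. It rests on assumptions that are not confirmed by this page: that timestamps look like “2016-02-10 01:30:00,” that embed_url holds a full URL such as https://archive.org/embed/PolAd_MarcoRubio_0py1v, and that the time zone for a market is supplied by the user as an IANA zone name. The function names are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Assumption: the CSV stores timestamps as "YYYY-MM-DD HH:MM:SS".
# Adjust this format string if the downloaded file differs.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_local(utc_text, zone_name):
    """Parse a UTC timestamp and convert it to an IANA zone, e.g. "America/Chicago".

    zoneinfo applies seasonal (daylight saving) changes for the given date.
    """
    utc_dt = datetime.strptime(utc_text, TIME_FORMAT).replace(tzinfo=timezone.utc)
    return utc_dt.astimezone(ZoneInfo(zone_name))


def ad_page_url(archive_id):
    """Add the site prefix and a trailing slash, as described for archive_id."""
    return f"http://politicaladarchive.org/ad/{archive_id}/"


def embed_code(embed_url):
    """Wrap an embed URL in the sample iframe markup quoted for embed_url."""
    return (
        f'<iframe src="{embed_url}" width="640" height="480" frameborder="0" '
        'webkitallowfullscreen="true" mozallowfullscreen="true" '
        "allowfullscreen></iframe>"
    )


# A winter airing and a summer airing in a Central-time market (illustrative values).
winter = to_local("2016-02-10 01:30:00", "America/Chicago")
summer = to_local("2016-07-01 01:30:00", "America/Chicago")
print(winter.isoformat(), summer.isoformat())
print(ad_page_url("polad_berniesanders_f0chv"))
```

Note that an evening airing in local time can fall on the following calendar day in UTC, which matters when grouping airings by date.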

 

Unique ads archived metadata

In addition to detailing specific airings of ads on television, the Political TV Ad Archive also archives ads that are not showing up on television stations we are monitoring. There could be a number of reasons for this: for example, perhaps the ad is targeted for an online audience only; or perhaps the ad is airing in states or cities that the project is not monitoring. Another possibility is that the ad is airing on stations not captured by the project, such as local cable programs. Finally, perhaps an ad has not yet aired in key primary states that the project is tracking but will air in the future.

The metadata in this download are similar to those in the ad airings metadata download. However, there are a few additional elements:

reference_count = Number of fact or source checks from our partner organizations for this particular ad. For example, the claim that Donald Trump once supported impeaching former President George W. Bush, contained in this ad sponsored by Our Principles PAC, a super PAC opposing Trump, was fact checked by PolitiFact, which rated it as “True.” The PolitiFact story is embedded on the Political TV Ad Archive page displaying the ad. Type: number.

date_ingested = Date this ad was added to the Political TV Ad Archive. Type: text.

transcript = Where available, the transcript for the ad. Type: text.
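As one example of using the additional fields, the sketch below picks out ads that have at least one fact or source check and orders them by reference_count. It assumes the unique ads download has a header row containing archive_id and reference_count, and that a blank reference_count can be treated as zero; the sample rows and the function name fact_checked_ads are invented for illustration.

```python
import csv
import io

# Made-up rows standing in for the unique ads download (assumption: the real
# header uses the field names described in this document).
SAMPLE_UNIQUE_ADS = """\
archive_id,sponsor,reference_count,transcript
polad_example_a,Example PAC,2,"Sample transcript text."
polad_example_b,Example Committee,0,
polad_example_c,Another Example PAC,,
polad_example_d,Example Committee,5,"Another sample transcript."
"""


def fact_checked_ads(csv_file):
    """Return (archive_id, reference_count) pairs with a count above zero,
    most-checked first."""
    checked = []
    for row in csv.DictReader(csv_file):
        # CSV values arrive as text; treat a blank count as zero (assumption).
        count = int(row.get("reference_count") or 0)
        if count > 0:
            checked.append((row["archive_id"], count))
    return sorted(checked, key=lambda pair: pair[1], reverse=True)


checked = fact_checked_ads(io.StringIO(SAMPLE_UNIQUE_ADS))
print(checked)
```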