The Excel file below includes the names and a variety of characteristics of every identifiable witness who testified before the House of Representatives between 1971 and 2016 (the 92nd to 114th Congresses). The general contours of our dataset were derived from a raw .txt file scraped from ProQuest’s Congressional Hearings database by Maher et al. (2020), who kindly shared it with us. Working from that raw file, we used R to extract the hearing information, to unnest the list of witnesses attached to each hearing, and to convert the result into an Excel spreadsheet containing 435,293 witnesses who appeared before the U.S. House of Representatives between 1971 and 2016.
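For readers unfamiliar with the term, "unnesting" turns one record per hearing, each carrying a list of witnesses, into one row per witness. The following is a minimal Python sketch of that idea, not the R code we used. The field names (`hearing_id`, `title`, `witnesses`) and the sample records are hypothetical placeholders and do not reflect the layout of the raw file.

```python
# Hypothetical hearing records; field names and values are illustrative only.
hearings = [
    {"hearing_id": "H001", "title": "Example hearing A",
     "witnesses": ["Witness One", "Witness Two"]},
    {"hearing_id": "H002", "title": "Example hearing B",
     "witnesses": ["Witness Three"]},
]


def unnest_witnesses(hearings):
    """Return one row per witness, copying the hearing-level fields onto each row."""
    rows = []
    for hearing in hearings:
        # Everything except the nested witness list is shared by all witnesses.
        shared = {key: value for key, value in hearing.items() if key != "witnesses"}
        for name in hearing["witnesses"]:
            rows.append({**shared, "witness": name})
    return rows


rows = unnest_witnesses(hearings)
for row in rows:
    print(row["hearing_id"], row["witness"])
```

Each output row repeats the hearing-level fields, which is what makes the individual witness the unit of analysis.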
The unit of analysis for this dataset is the individual congressional witness. For each witness, the data also include the date of the hearing, the hearing title, the congressional committee holding the hearing, the Congressional Information Service (CIS) hearing identification number, and the subject matter of the hearing. To these fields, which were pulled directly from the scraped .txt file, we added columns for committee type, using the Deering and Smith (1997) typology (e.g., Policy, Prestige, Constituency, and Undesired), and for witness type, using an adapted version of the Burstein and Hirsch (2007) typology of congressional witnesses. (We detail our coding system for witness type in the accompanying PDF file, below.) We also added dummy variables for unified vs. divided government, for party control, and for whether the hearing took place before, during, or after the pivotal 104th Congress.
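As an illustration of how the timing variable could be derived from a hearing date, here is a short Python sketch. It is an assumption-laden example rather than our actual coding procedure: the function names are hypothetical, and the Congress number is approximated from the calendar year alone, which ignores the fact that a new Congress convenes in early January rather than on January 1.

```python
from datetime import date


def congress_number(hearing_date):
    """Approximate the Congress from the calendar year.

    The 1st Congress began in 1789 and each Congress spans two years.
    Assumption: hearings in the first days of January of an odd year are
    assigned to the incoming Congress, which is a simplification.
    """
    return (hearing_date.year - 1789) // 2 + 1


def era_relative_to_104th(congress):
    """Classify a Congress as before, during, or after the 104th."""
    if congress < 104:
        return "before"
    if congress == 104:
        return "during"
    return "after"


example = date(1995, 3, 15)  # hypothetical hearing date
print(congress_number(example), era_relative_to_104th(congress_number(example)))
```

Under this approximation, 1971 maps to the 92nd Congress and 2016 to the 114th, matching the range covered by the dataset.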
We exclude from this dataset cases in which the witness's identity could not be discerned.
To cite this dataset, please use the following format:
Bell, Lauren C. and John D. Rackey [date retrieved]. U.S. House of Representatives Witness Database, 1971-2016. Retrieved from https://www.rmc.edu/departments/political-science/faculty/lauren-bell/dataset-information.