“It all begins today!” The Collected Tweets of Donald Trump (20 January 2017-08 January 2021)

tl;dr we share a repository including the tweets published by the realdonaldtrump account between 20 January 2017 and 08 January 2021.

If you have followed this blog for a while, you know that my interest in the text and data analysis of Trump’s tweetage is not new. See for example this, this, or this, or even this. There’s more if you follow the links.

My interest in this tweetage rests on the conviction that access to sources and evidence is key to understanding situations, and in this case key to resisting and critiquing white supremacy, fascism, sexism, totalitarianism, etc. This is not content one reads for pleasure or enjoyment, much less because one agrees with it. Pretending Trump and his tweets never happened will not necessarily help us, in my opinion, to avoid it all happening again under similar or different guises. Think of it as a way of documenting a type of behaviour we want to learn from and denounce in order to resist it better in the future.

Needless to say, today is a very important day in the history of the United States and, we should say, of the rest of the world.

To celebrate this Inauguration Day I am now sharing a repository of source data and supplemental materials that were created initially for a project by former PG data science student Carlota Barata, which I supervised.

The project originally focused on Trump’s tweeting during 2020 regarding the pandemic. Now that Carlota has successfully graduated, and given the events that have continued to unfold since this Fall, we have extended the scope of the project to complete and share a wider dataset that covers Trump’s tenure as President, including the text, source and timestamp metadata that we both collected for the tweets published by the realdonaldtrump account between 20 January 2017 and 08 January 2021. It must be noted, of course, that on Friday 08 January 2021 Twitter permanently suspended the @realDonaldTrump account.

Priego, E., & Barata, C. (2021, January 20). Repository: “It all begins today!” Archiving the Tweets Published by the realdonaldtrump Twitter Account 20 January 2017-08 January 2021. Retrieved from osf.io/qhpba

For more information about how we collected and presented the data, please see the README file included in the repository.

We are categorically not the first to collect this data or make it available. In spite of Twitter’s suspension of the account, the relevant data remains publicly available in different formats, from different sources and at different levels of completeness and tidiness. For comparison and robustness, our repository also includes a compressed file in comma separated values format as directly exported from https://www.thetrumparchive.com/ (last updated 01/08/2020). For more information on this version of the data please see https://www.thetrumparchive.com/faq.
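For readers who want to check a comma separated values export against the date range covered here, the following is a minimal sketch of one way to do it, not a description of how we processed the data. The column names (`text`, `source`, `created_at`), the timestamp format and the assumption that timestamps are in UTC are all hypothetical; the actual field names are documented in the README and in the Trump Archive FAQ. The sample rows are invented placeholders, not real tweets.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column name and timestamp format: check the README for the real ones.
TIME_COL = "created_at"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Date range covered by the repository, assumed here to be expressed in UTC.
START = datetime(2017, 1, 20, tzinfo=timezone.utc)
END = datetime(2021, 1, 8, 23, 59, 59, tzinfo=timezone.utc)


def tweets_in_term(csv_file):
    """Yield the rows whose timestamp falls between START and END, inclusive."""
    for row in csv.DictReader(csv_file):
        posted = datetime.strptime(row[TIME_COL], TIME_FORMAT)
        posted = posted.replace(tzinfo=timezone.utc)
        if START <= posted <= END:
            yield row


# Invented placeholder rows standing in for an exported file.
SAMPLE = """text,source,created_at
before the term,Twitter for Android,2016-12-31 12:00:00
during the term,Twitter for iPhone,2017-01-20 17:00:00
"""

rows = list(tweets_in_term(io.StringIO(SAMPLE)))
for row in rows:
    print(row[TIME_COL], row["source"], row["text"])
```

To use it on a real export, replace `io.StringIO(SAMPLE)` with an opened file, for example `open(path, newline="", encoding="utf-8")`, after adjusting the column names and format to match the file.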

A plethora of scholarly literature is already available providing direct access to datasets of Trump’s tweets (to name just one example, cf. Clarke I, Grieve J (2019) Stylistic variation on the Donald Trump Twitter account: A linguistic analysis of tweets posted between 2009 and 2018. PLoS ONE 14(9): e0222062. https://doi.org/10.1371/journal.pone.0222062). We hope the wider availability of tweet corpora can help foster more studies of this kind.

The Presidential and Federal Records Act Amendments of 2014 expanded the PRA’s definition of records to include electronic content. According to Sarah Quigley, the chairperson of the Society of American Archivists’ Committee on Public Policy,

“it’s not the media that makes it a record, or the platform, it’s the content […] a presidential record is anything — any record, on any platform or in any media — created by the president or his office in the conduct of his business as president. And all of those records are considered permanent by the National Archives and the Presidential Records Act”

(NPR, October 25, 2019, 9:17 AM ET).

As Presidential Records, the data is of public interest (see also Twitter on public interest). Now that his account is suspended, it remains paramount that researchers have timely access to the data required to continue studying this aspect and period of world history.

The original project, including data collection, obtained ethics approval from the City, University of London CS REC in the Spring of 2020, as part of a now-completed PG research project.

The repository we have shared is a living resource and new files (such as charts and a tagged corpus) may be uploaded in due course; the materials have been shared now in the spirit of fostering timely open science and international collaboration. More information is available in the README file included in the repository.