BBC Politics Election Tweets: A Quick Text Analysis

Using Martin Hawksey’s TAGS, I collected 100 tweets from the official @BBCPolitics Twitter account posted between 26/05/2014 00:14:31 and 26/05/2014 12:59:10 BST.

I copied the text of the tweets and ran a basic text analysis using Voyant Tools by Stéfan Sinclair & Geoffrey Rockwell. I customised the English ‘Taporware’ stop word list to include reporting-specific terms such as ‘says’ (the list should be refined further, as I accidentally left ‘declared’ in) and Twitter-specific terms likely to be over-represented, like ‘http’, ‘rt’ and ‘t.co’. Some shortened URLs remained. I deliberately left the hashtags ‘#EP2014’ and ‘#vote2014’ in the corpus.
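For readers who would rather script this step than use Voyant Tools, here is a minimal Python sketch of the stop word filtering described above. It is not how Voyant works internally: the tokeniser is an assumption, `BASE_STOP_WORDS` is only a tiny stand-in for the much longer ‘Taporware’ list, and the two sample tweets are made up for illustration.

```python
import re
from collections import Counter

# Tiny stand-in for the English 'Taporware' list (assumption: the real list is far longer).
BASE_STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}

# Additions named in the post: reporting-specific and Twitter-specific terms.
CUSTOM_STOP_WORDS = {"says", "http", "rt", "t.co"}

STOP_WORDS = BASE_STOP_WORDS | CUSTOM_STOP_WORDS

# Assumed tokeniser: keeps #hashtags and @mentions whole, and keeps
# dotted or apostrophised tokens such as 't.co' and "didn't" together.
TOKEN_PATTERN = re.compile(r"[#@]?\w+(?:[.'’]\w+)*")


def tokenize(text):
    """Lower-case a tweet and split it into tokens."""
    return [token.lower() for token in TOKEN_PATTERN.findall(text)]


def term_frequencies(tweets, stop_words=STOP_WORDS):
    """Count every token that is not in the stop word list."""
    counts = Counter()
    for tweet in tweets:
        counts.update(t for t in tokenize(tweet) if t not in stop_words)
    return counts


# Made-up tweets, for illustration only.
sample_tweets = [
    "RT @BBCBreaking: UKIP says result is big #vote2014 http://t.co/abc123",
    "UKIP result declared #vote2014",
]
frequencies = term_frequencies(sample_tweets)
print(frequencies.most_common(5))
```

In this sketch the hashtags survive because they are not in the stop list, and the tail of the shortened URL (`abc123`) is still counted as a term, which mirrors the leftover shortened URLs mentioned above.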

There is 1 document in this corpus with a total of 1,956 words and 695 unique words.


Cirrus cloud visualising the most frequent terms in a corpus of 100 tweets from the official @BBCPolitics Twitter account posted between 26/05/2014 00:14:31 and 26/05/2014 12:59:10 BST. Cloud CC-BY Ernesto Priego. Created with Voyant Tools by Stéfan Sinclair & Geoffrey Rockwell (©2014). Source data and more info at epriego.wordpress.com

Words in the Entire Corpus
Corpus Term Frequencies provides an ordered list of the frequencies of all terms appearing in a corpus. The first column shows each keyword, in order of frequency; the second column shows the number of times it appears in the corpus. The other columns can be toggled to show further statistical information, including a small line graph of term frequency across the corpus.

Created with Voyant Tools by Stéfan Sinclair & Geoffrey Rockwell (©2014).
Term              Count
#vote2014            32    5.10   172.3   0.000
ukip                 18    2.66    96.9   0.000
election             17    2.49    91.5   0.000
lib                  17    2.49    91.5   0.000
elections            15    2.14    80.8   0.000
european             15    2.14    80.8   0.000
results              15    2.14    80.8   0.000
#ep2014              13    1.79    70.0   0.000
vote                 13    1.79    70.0   0.000
@bbcr4today          11    1.44    59.2   0.000
lab                  11    1.44    59.2   0.000
party                11    1.44    59.2   0.000
green                10    1.27    53.9   0.000
@chrismasonbbc        9    1.09    48.5   0.000
dem                   9    1.09    48.5   0.000
eu                    9    1.09    48.5   0.000
farage                9    1.09    48.5   0.000
labour                9    1.09    48.5   0.000
meps                  9    1.09    48.5   0.000
result                9    1.09    48.5   0.000
uk                    9    1.09    48.5   0.000
@bbcbreaking          8    0.92    43.1   0.000
david                 8    0.92    43.1   0.000
dems                  8    0.92    43.1   0.000
far                   8    0.92    43.1   0.000
seat                  8    0.92    43.1   0.000
#r4today              7    0.74    37.7   0.000
@bbcnormans           7    0.74    37.7   0.000
cameron               7    0.74    37.7   0.000
coverage              7    0.74    37.7   0.000
london                7    0.74    37.7   0.000
snp                   7    0.74    37.7   0.000
@rebeccakeating       6    0.57    32.3   0.000
scotland              6    0.57    32.3   0.000
votes                 6    0.57    32.3   0.000
euro                  5    0.39    26.9   0.000
new                   5    0.39    26.9   0.000
nick                  5    0.39    26.9   0.000
parties               5    0.39    26.9   0.000
people                5    0.39    26.9   0.000
pm                    5    0.39    26.9   0.000
seats                 5    0.39    26.9   0.000
video                 5    0.39    26.9   0.000
big                   4    0.22    21.5   0.000
bnp                   4    0.22    21.5   0.000
clegg                 4    0.22    21.5   0.000
declared              4    0.22    21.5   0.000
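
The ordering of the list above appears to be descending count, with tied terms in alphabetical order (‘#’ and ‘@’ sort before letters). As a rough sketch, and only as my reading of the output rather than a description of Voyant’s own code, the same ordering can be reproduced in Python. The counts are a handful of rows copied from the list and deliberately shuffled.

```python
# A few (term, count) pairs copied from the list above, deliberately shuffled.
counts = {
    "vote": 13,
    "ukip": 18,
    "lib": 17,
    "#vote2014": 32,
    "election": 17,
    "#ep2014": 13,
}


def ordered_terms(term_counts):
    """Sort by descending count, breaking ties alphabetically."""
    return sorted(term_counts.items(), key=lambda item: (-item[1], item[0]))


for term, count in ordered_terms(counts):
    print(f"{term:<12}{count:>4}")
```

Sorting on the pair `(-count, term)` keeps the whole ordering in a single `sorted` call, so no second pass is needed for the ties.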

 

I also collected 49 tweets posted by @bbcnickrobinson between 18/05/2014 21:21:34 and 26/05/2014 02:34:07 BST. I followed the same procedure as above, producing the following Cirrus cloud and frequency list.

There is 1 document in this corpus with a total of 946 words and 458 unique words.

Cirrus cloud visualising the most frequent terms in a corpus of 49 tweets from @bbcnickrobinson posted between 18/05/2014 21:21:34 and 26/05/2014 02:34:07 BST.

Words in the Entire Corpus
Corpus Term Frequencies provides an ordered list of the frequencies of all terms appearing in a corpus. The first column shows each keyword, in order of frequency; the second column shows the number of times it appears in the corpus. The other columns can be toggled to show further statistical information, including a small line graph of term frequency across the corpus.

Created with Voyant Tools by Stéfan Sinclair & Geoffrey Rockwell (©2014).
Term              Count
farage               10    2.71   106.0   0.000
ukip                  9    2.36    95.4   0.000
blog                  7    1.68    74.2   0.000
vote                  7    1.68    74.2   0.000
lib                   6    1.34    63.6   0.000
@bbcpolitics          5    1.00    53.0   0.000
clegg                 5    1.00    53.0   0.000
election              5    1.00    53.0   0.000
nigel                 5    1.00    53.0   0.000
night                 5    1.00    53.0   0.000
power                 5    1.00    53.0   0.000
@bbcnickrobinson      4    0.66    42.4   0.000
@nick                 4    0.66    42.4   0.000
dem                   4    0.66    42.4   0.000
european              4    0.66    42.4   0.000
just                  4    0.66    42.4   0.000
morning               4    0.66    42.4   0.000
romanians             4    0.66    42.4   0.000
#ep2014               3    0.32    31.8   0.000
#vote2014             3    0.32    31.8   0.000
david                 3    0.32    31.8   0.000
dimbleby              3    0.32    31.8   0.000
elections             3    0.32    31.8   0.000
got                   3    0.32    31.8   0.000
interview             3    0.32    31.8   0.000
know                  3    0.32    31.8   0.000
labour                3    0.32    31.8   0.000
millwall              3    0.32    31.8   0.000
poll                  3    0.32    31.8   0.000
says                  3    0.32    31.8   0.000
send                  3    0.32    31.8   0.000
tories                3    0.32    31.8   0.000
uk                    3    0.32    31.8   0.000
win                   3    0.32    31.8   0.000
words                 3    0.32    31.8   0.000
@bbcnews              2   -0.02    21.2   0.000
@nigel                2   -0.02    21.2   0.000
@thelawyercatrin      2   -0.02    21.2   0.000
answer                2   -0.02    21.2   0.000
band                  2   -0.02    21.2   0.000
beaming               2   -0.02    21.2   0.000
capital               2   -0.02    21.2   0.000
completely            2   -0.02    21.2   0.000
coverage              2   -0.02    21.2   0.000
day                   2   -0.02    21.2   0.000
dems                  2   -0.02    21.2   0.000
didn’t                2   -0.02    21.2   0.000
doing                 2   -0.02    21.2   0.000
ed                    2   -0.02    21.2   0.000
europe                2   -0.02    21.2   0.000

Both corpora are small, but it is notable that the top results from these two major BBC politics Twitter accounts clearly overlap. It’s up to the reader to draw conclusions. I have uploaded the source data to figshare:

Priego, Ernesto (2014): Corpora of 100 Tweets from BBCPolitics and 49 Tweets from bbcnickrobinson in context of European Election Results 2014. figshare.
http://dx.doi.org/10.6084/m9.figshare.1036647
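
For anyone who wants to check the coincidences between the two lists directly, here is a small Python sketch that compares the first ten rows of each frequency list as they appear in this post. The ten-row cut-off is my own arbitrary assumption for the example (it cuts through tied counts), so treat the result as indicative only.

```python
# First ten rows of each frequency list in this post, in the order shown.
bbc_politics_top = [
    "#vote2014", "ukip", "election", "lib", "elections",
    "european", "results", "#ep2014", "vote", "@bbcr4today",
]
nick_robinson_top = [
    "farage", "ukip", "blog", "vote", "lib",
    "@bbcpolitics", "clegg", "election", "nigel", "night",
]


def shared_terms(first, second):
    """Return the terms found in both lists, keeping the order of the first."""
    second_set = set(second)
    return [term for term in first if term in second_set]


print(shared_terms(bbc_politics_top, nick_robinson_top))
# prints ['ukip', 'election', 'lib', 'vote']
```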