“Stronger In”: Looking Into a Sample Archive of 1,005 StrongerIn Tweets

If you haven’t been there already, please start here. An introduction and a detailed methodological note provide context to this post.

I have now shared a spreadsheet containing an archive of 1,005 Tweets published publicly by the @StrongerIn account between 12/06/2016 13:34:35 and 21/06/2016 13:11:34 BST.

The spreadsheet has four more sheets: a data summary from the archive, a table of tweets’ sources, and tables of corpus term and trend counts and of collocate counts.

This will hopefully make it possible to compare two similar samples from the output of two homologous Twitter accounts, which officially represent the ‘Leave’ and ‘Remain’ sides of the UK EU Referendum respectively. The period collected is the same, and if desired the sets can be edited to have, for example, 1,000 Tweets each.

Following the structure of my previous post on the ‘Vote Leave’ dataset, here are some quick insights from the @StrongerIn account for comparison.

Archive (from:StrongerIn)

Number of links	735
Number of RTs	409	<-estimate based on occurrence of RT
Number of Tweets	1005
Unique tweets	1004	<-used to monitor quality of archive
First Tweet in Archive	12/06/2016 13:34:35	BST
Last Tweet in Archive	21/06/2016 13:11:34	BST
In Reply Ids	9
In Reply @s	0
Tweet rate (tw/min)	0.1	Tweets/min (from last archive 10mins)

Like the @vote_leave account, @StrongerIn is used mainly for broadcasting Tweets, and no @ Replies to users were collected during the period represented in the dataset.

This dataset was collected over slightly different timings but covers the same number of days. Although it contains 95 fewer Tweets than the Vote Leave one (1,005 against 1,100), it shows that the @StrongerIn account shared 235 more links than its @vote_leave counterpart.
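For readers who want to reproduce this kind of summary from their own export, here is a minimal Python sketch of how the counts in the table could be derived from a list of Tweet texts. It is not the method the spreadsheet itself uses: the function name `summarise_archive`, the rule that a retweet is a text starting with "RT @", and the choice to count every URL in every Tweet as a link are all assumptions made for illustration, and the sample texts are invented.

```python
import re

# Assumption: any http(s) URL in the Tweet text counts as one link.
LINK_PATTERN = re.compile(r"https?://\S+")


def summarise_archive(tweets):
    """Return rough summary counts for a list of Tweet texts."""
    return {
        "tweets": len(tweets),
        # Identical texts are collapsed, which helps spot duplicate rows.
        "unique_tweets": len(set(tweets)),
        # Estimate only: a Tweet is treated as a retweet if it starts with "RT @".
        "retweets": sum(1 for text in tweets if text.startswith("RT @")),
        "links": sum(len(LINK_PATTERN.findall(text)) for text in tweets),
    }


# Hypothetical sample texts, not taken from the archive.
sample = [
    "RT @example: Read this https://t.co/abc",
    "Our economy is stronger in Europe https://t.co/def",
    "Our economy is stronger in Europe https://t.co/def",
    "Watch the debate tonight",
]

summary = summarise_archive(sample)
print(summary)
```

A gap between the number of Tweets and the number of unique Tweets, as in the table above, is a quick signal that a row may have been collected twice.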

Sources

Unlike @vote_leave, the dataset does not indicate that @StrongerIn uses Buffer or Twitter for iPhone. Instead, TweetDeck (413) and the Twitter Web Client (591) appear as the main sources. There is even one interestingly strange Tweet, published from NationBuilder, that links to a StrongerIn web page returning a 404 error.

Source	Count
Nationbuilder	1
TweetDeck	413
Twitter Web Client	591
Total	1,005

Most Frequent Words

After removing Twitter data-specific stopwords from the raw data (e.g. t.co, amp, rt), the 10 most frequent words in the corpus are:

Term	Count	Trend
eu	287	0.013906387
remain	224	0.010853765
bbcqt	216	0.01046613
europe	209	0.01012695
vote	170	0.008237232
strongerin	167	0.00809187
uk	159	0.0077042347
jobs	148	0.0071712374
leave	148	0.0071712374
eudebate	113	0.0054753367
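The counts above come from the spreadsheet, but the general procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool used for the post: the function name `top_terms` is hypothetical, links are stripped before tokenising (so "t.co" never becomes a token), and only the Twitter-specific stopwords named above are removed, so a real run would also need a general English stopword list. The "Trend" column is assumed here to be a relative frequency, that is, the term count divided by the total number of tokens. The table is consistent with that reading, since each count divided by its trend gives roughly 20,638.

```python
import re
from collections import Counter

# Twitter-specific stopwords named in the post; "t.co" is handled by stripping links.
TWITTER_STOPWORDS = {"amp", "rt"}


def top_terms(texts, n=10, stopwords=TWITTER_STOPWORDS):
    """Return (term, count, relative frequency) for the n most frequent terms."""
    tokens = []
    for text in texts:
        # Lowercase and drop links before splitting into word tokens.
        cleaned = re.sub(r"https?://\S+", " ", text.lower())
        tokens.extend(re.findall(r"[a-z0-9_]+", cleaned))
    total = len(tokens)  # assumption: the denominator includes stopwords
    counts = Counter(token for token in tokens if token not in stopwords)
    return [(term, count, count / total) for term, count in counts.most_common(n)]


# Hypothetical sample texts, not taken from the archive.
texts = [
    "RT @user: Vote remain &amp; stay in the EU https://t.co/x",
    "EU jobs matter, vote remain",
]

result = top_terms(texts, n=3)
for term, count, trend in result:
    print(term, count, round(trend, 4))
```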

Compare them with the 10 most frequent words in the vote_leave data. Anything interesting?

Let’s compare the top 10 terms from each account side by side:

Top 10 Terms in 1,100 vote_leave Tweets over 7 days	vote_leave count	Top 10 Terms in 1,005 StrongerIn Tweets over 7 days	StrongerIn count
voteleave	558	eu	287
eu	402	remain	224
bbcqt	398	bbcqt	216
gove	165	europe	209
takecontrol	146	vote	170
immigration	133	strongerin	167
control	95	uk	159
cameron	89	jobs	148
turkey	84	leave	148
uk	72	eudebate	113

The terms in red are those appearing in both datasets (eu, bbcqt and uk); the terms in blue correspond to the name of each campaign (voteleave and strongerin). It’s interesting that, though the StrongerIn account has 182 fewer mentions of ‘bbcqt’ (bear in mind the StrongerIn dataset has 95 fewer Tweets), ‘bbcqt’ remains in third place in both sets.

The difference in where each campaign’s own name ranks is noticeable. So is the fact that the vote_leave account has the name of the Prime Minister (himself a Remain campaigner) in its top 10, as well as that of Gove (a Leave campaigner), while StrongerIn has no names of politicians among its 10 most frequent words.
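Because the two samples differ in size, it can help to look at shared terms as occurrences per Tweet rather than as raw counts. The Python sketch below does that using only the figures from the table above; the variable names are mine, and "occurrences per Tweet" is simply each term count divided by the number of Tweets in that dataset, a rough normalisation rather than a measure used in the original spreadsheets.

```python
# Top 10 term counts copied from the comparison table above.
vote_leave = {
    "voteleave": 558, "eu": 402, "bbcqt": 398, "gove": 165, "takecontrol": 146,
    "immigration": 133, "control": 95, "cameron": 89, "turkey": 84, "uk": 72,
}
stronger_in = {
    "eu": 287, "remain": 224, "bbcqt": 216, "europe": 209, "vote": 170,
    "strongerin": 167, "uk": 159, "jobs": 148, "leave": 148, "eudebate": 113,
}

VOTE_LEAVE_TWEETS = 1100
STRONGER_IN_TWEETS = 1005

# Terms present in both top 10s, in vote_leave rank order.
shared = [term for term in vote_leave if term in stronger_in]

for term in shared:
    leave_rate = vote_leave[term] / VOTE_LEAVE_TWEETS
    remain_rate = stronger_in[term] / STRONGER_IN_TWEETS
    print(f"{term}: {leave_rate:.3f} (vote_leave) vs {remain_rate:.3f} (StrongerIn) per Tweet")
```

Normalising this way makes it easier to see whether a gap such as the 182 fewer ‘bbcqt’ mentions is larger than the difference in sample size alone would explain.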

There are other potentially interesting or noticeable differences when we compare these two top 10s. Can you spot them? Do they tell us anything or not?

Digging into data and creating datasets does not necessarily tell us new things, but it does allow us to pinpoint otherwise moving objects. We don’t need to pin butterflies to recognise they are indeed butterflies, but the intention is to create new settings for observation.

References

González-Bailón, S., Banchs, R.E. and Kaltenbrunner, A. (2012) Emotions, Public Opinion and U.S. Presidential Approval Rates: A 5 Year Analysis of Online Political Discussions. Human Communication Research 38 (2) 121-143.

González-Bailón, S. et al (2012) Assessing the Bias in Communication Networks Sampled from Twitter (December 4, 2012). DOI: http://dx.doi.org/10.2139/ssrn.2185134

Hawksey, M. (2013) What the little birdy tells me: Twitter in education. Published on November 12, 2013. Presentation given from the LSE NetworkED Seminar Series 2013 on the use of Twitter in Education. Available from http://www.slideshare.net/mhawksey/what-the-little-birdy-tells-me-twitter-in-education [Accessed 21 June 2016].

Priego, E. (2016) “Vote Leave” Looking Into a Sample Archive of 1,100 vote_leave Tweets. 21 June 2016. Available from https://epriego.wordpress.com/2016/06/21/vote-leave-looking-into-a-sample-archive-of-1100-vote_leave-tweets/. [Accessed 21 June 2016].

Priego, E. (2016) “Vote Leave” A Dataset of 1,100 Tweets by vote_leave with Archive Summary, Sources and Corpus Terms and Collocates Counts and Trends. figshare. DOI: https://dx.doi.org/10.6084/m9.figshare.3452834.v1

Priego, E. (2016) “Stronger In” A Dataset of 1,005 Tweets by StrongerIn with Archive Summary, Sources and Corpus Terms and Collocates Counts and Trends. figshare. DOI: https://dx.doi.org/10.6084/m9.figshare.3456617.v1

[StrongerIn]. (2016) [Twitter account]. Retrieved from https://twitter.com/StrongerIn. [Accessed 21 June 2016].

[vote_leave]. (2016) [Twitter account]. Retrieved from https://twitter.com/vote_leave. [Accessed 21 June 2016].

Published by Ernesto Priego