
Part 2 is here, part 3 here and the final, fourth part is here.
—
IFLA stands for the International Federation of Library Associations and Institutions.
The IFLA World Library and Information Congress 2016 and 82nd IFLA General Conference and Assembly, ‘Connections. Collaboration. Community’, is currently taking place (13–19 August 2016) at the Greater Columbus Convention Center (GCCC) in Columbus, Ohio, United States.
The official hashtag of the conference is #WLIC2016. Earlier, I shared a searchable, live archive of the hashtag here. (The page may be slow to load, depending on bandwidth.)
I have looked at the text of 4,945 Tweets published with #WLIC2016 from 14/08/2016 to 15/08/2016 11:16:06 (EDT, Columbus, Ohio time). Only accounts with at least 1 follower were included. I collected them with Martin Hawksey’s TAGS.
According to Voyant Tools this corpus had 82,809 total words and 7,506 unique word forms.
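For readers who would like to try a comparable count outside Voyant Tools, here is a minimal Python sketch of "total words" versus "unique word forms". The tokenisation rule, the function names and the two sample Tweets are all assumptions made for illustration; Voyant's own tokeniser may split text differently, so the figures would not necessarily match the ones above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word-like tokens.

    Assumption: a token is a run of letters, digits, underscores or
    apostrophes. Voyant Tools may tokenise differently.
    """
    return re.findall(r"[a-z0-9_']+", text.lower())

def corpus_stats(tweets):
    """Return (total words, unique word forms) for a list of Tweet texts."""
    tokens = [tok for tweet in tweets for tok in tokenize(tweet)]
    return len(tokens), len(set(tokens))

# Hypothetical sample Tweets, not taken from the #WLIC2016 archive.
sample_tweets = [
    "Libraries make copyright work for people",
    "Copyright exceptions matter to libraries",
]

total_words, unique_forms = corpus_stats(sample_tweets)
print(total_words, unique_forms)
```

With the real archive, the list of Tweet texts would come from the text column of the TAGS spreadsheet rather than from a hard-coded list.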
I applied an English stop word list which I edited to include Twitter-specific terms (https, t.co, amp (&) etc.), proper names (Barack Obama, other personal usernames) and some French stop words (mainly personal pronouns). I also edited the stop word list to include some dataset-specific terms such as the conference hashtag and other common hashtags, ‘ifla’, etc. (I left others that could also be considered dataset-specific terms, such as ‘session’ though).
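The stop word editing described above could be sketched in Python roughly as follows. The word sets here are tiny illustrative samples and not the actual list I used; the set names, the tokenisation rule and the sample Tweet are assumptions too.

```python
import re

# Illustrative samples only; the real stop word list was much longer.
ENGLISH_STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "is", "at"}
TWITTER_TERMS = {"https", "t.co", "amp"}
FRENCH_PRONOUNS = {"je", "tu", "il", "elle", "nous", "vous", "ils", "elles"}
DATASET_TERMS = {"wlic2016", "ifla"}

STOP_WORDS = ENGLISH_STOP_WORDS | TWITTER_TERMS | FRENCH_PRONOUNS | DATASET_TERMS

def tokenize(text):
    """Lowercase and split into tokens, keeping inner dots so 't.co' stays whole."""
    return re.findall(r"[a-z0-9_]+(?:[.'][a-z0-9_]+)*", text.lower())

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop word set."""
    return [tok for tok in tokens if tok not in stop_words]

# Hypothetical Tweet mixing French, a hashtag and an HTML-escaped ampersand.
tweet = "Nous aimons the opening session of #WLIC2016 &amp; IFLA"
print(remove_stop_words(tokenize(tweet)))
```

Note that 'session' survives the filter here, just as it was left in the real list.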
The result was a listing of 800 frequent terms (the least frequent terms in the list had been repeated 5 times). I then cleaned the data of any dataset-specific stop words that the stop word list did not filter and created an edited, ordered listing of the 50 most frequent terms. I left in organisations’ Twitter user names (including @potus), as well as other terms that may not seem that meaningful on their own (but who knows, they may be).
Bear in mind that the corpus included Retweets. Each RT counted as a single Tweet, so its text was counted again every time it appeared. The term counts in the list therefore reflect this repetition.
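The step from a long frequency listing to an edited top 50 can be sketched like this. It is only an approximation of what was done by hand with the Voyant Tools output: the function names, the `extra_stop_words` parameter and the sample tokens are hypothetical.

```python
from collections import Counter

def frequent_terms(tokens, min_count=5):
    """Return (term, count) pairs, most frequent first, seen at least min_count times."""
    counts = Counter(tokens)
    return [(term, n) for term, n in counts.most_common() if n >= min_count]

def top_terms(tokens, extra_stop_words=(), limit=50, min_count=5):
    """Drop leftover dataset-specific terms, then keep the first `limit` entries."""
    excluded = set(extra_stop_words)
    kept = [(t, n) for t, n in frequent_terms(tokens, min_count) if t not in excluded]
    return kept[:limit]

# Hypothetical token list standing in for the filtered corpus.
tokens = ["wlic"] * 7 + ["libraries"] * 6 + ["copyright"] * 5 + ["rare"] * 2

print(top_terms(tokens, extra_stop_words={"wlic"}, limit=2))
```

The `min_count=5` default mirrors the cut-off of the 800-term listing, and `limit=50` mirrors the final edited list.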
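To see how much Retweets can inflate a count, here is a small hedged sketch comparing term counts with and without them. It assumes that a Retweet can be recognised by its text starting with "RT @", which is a simplification; the account name and Tweets are made up.

```python
import re
from collections import Counter

def is_retweet(text):
    """Assumption: old-style Retweets start with 'RT @'."""
    return text.startswith("RT @")

def term_counts(tweets, include_retweets=True):
    """Count tokens across Tweets, optionally skipping Retweets."""
    counts = Counter()
    for text in tweets:
        if not include_retweets and is_retweet(text):
            continue
        counts.update(re.findall(r"[a-z0-9_']+", text.lower()))
    return counts

# Hypothetical Tweets: one original and two Retweets of it.
tweets = [
    "Copyright matters for libraries",
    "RT @example_user: Copyright matters for libraries",
    "RT @example_user: Copyright matters for libraries",
]

with_rts = term_counts(tweets)
without_rts = term_counts(tweets, include_retweets=False)
print(with_rts["copyright"], without_rts["copyright"])
```

The counts in this post correspond to the first case, with Retweets included.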
If for some reason you are curious about what the most frequent words in #WLIC2016 Tweets were during this initial period (see above), here’s the top 50:
Term | Count |
libraries | 543 |
copyright | 517 |
librarians | 484 |
library | 406 |
session | 374 |
world | 326 |
message | 271 |
opening | 249 |
access | 226 |
make | 204 |
digital | 195 |
internet | 162 |
future | 161 |
information | 157 |
new | 146 |
use | 141 |
people | 138 |
president | 131 |
potus | 125 |
literacy | 118 |
need | 117 |
oclc | 114 |
ceremony | 113 |
dpla | 109 |
poster | 105 |
thanks | 103 |
collections | 102 |
public | 100 |
delegates | 99 |
cilipinfo | 98 |
countries | 95 |
iflatrends | 95 |
 | 93 |
shaping | 91 |
work | 89 |
drag | 83 |
report | 83 |
create | 81 |
open | 81 |
data | 79 |
content | 78 |
learn | 78 |
latest | 77 |
making | 77 |
fight | 76 |
ifla_arl | 75 |
read | 74 |
info | 73 |
exceptions | 69 |
great | 68 |
So, for what it’s worth, those were the 50 most frequent terms in the corpus.
I, for one, not being present at the Congress, found it interesting that ‘copyright’ is the second most frequent term, after ‘libraries’. Unsurprisingly, the list also includes some key terms (such as ‘access’, ‘internet’, ‘digital’, ‘open’, ‘data’) that have concerned Library and Information professionals of late.
Were these the terms you’d have expected to make a ‘top 50’ in almost 5,000 Tweets from this initial phase of this particular conference?
The conference hasn’t finished yet of course. But so far, for a libraries and information world congress, which terms would you say are noticeable by their absence in this list? ;-)