https://doi.org/10.1140/epjb/e2006-00091-3
The network of concepts in written texts
1 Instituto de Física, Universidade Federal da Bahia, Campus Universitário da Federação, 40210-340 Salvador, BA, Brazil; Instituto de Matemática, Universidade Federal da Bahia, Campus Universitário da Federação, 40210-340 Salvador, BA, Brazil
2 Institut Gaspard-Monge, Université de Marne-la-Vallée, 77454 Marne-la-Vallée Cedex 2, France
Corresponding author: randrade@ufba.br
Received: 14 November 2005
Published online: 31 March 2006
Complex network theory is used to investigate the structure of meaningful concepts in written texts by individual authors. Networks are constructed after a two-phase filtering, in which words with little semantic content are eliminated and all remaining words are reduced to their canonical form, without any number, gender or tense inflection. Each sentence in the text is then added to the network as a clique. A large number of written texts have been scrutinised, and the resulting networks are found to have both small-world and scale-free structures. The growth process of these networks has also been investigated, and a universal evolution of network quantifiers has been found among the set of texts written by distinct authors. Further analyses, based on shuffling procedures applied either to the texts or to the constructed networks, provide hints on the role played by the word frequency and sentence length distributions in shaping the network structure.
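The construction described in the abstract (filter, reduce to canonical form, add each sentence as a clique) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word set `STOP_WORDS`, the lemma table `LEMMAS`, the regex-based sentence splitting, and the function names are all hypothetical stand-ins chosen only to keep the example self-contained.

```python
from itertools import combinations
import re

# Hypothetical stop-word list and lemma table, for illustration only;
# a real study would rely on a full linguistic dictionary.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "are", "to"}
LEMMAS = {"cats": "cat", "dogs": "dog", "chased": "chase", "chases": "chase"}


def sentence_concepts(sentence):
    """Two-phase filtering: drop stop words, then map words to canonical forms."""
    words = re.findall(r"[a-z]+", sentence.lower())
    kept = [LEMMAS.get(w, w) for w in words if w not in STOP_WORDS]
    return sorted(set(kept))  # each concept counts once per sentence


def build_network(text):
    """Add every sentence to an undirected graph as a clique of its concepts."""
    adjacency = {}
    # Assumption: sentences end at '.', '!' or '?'.
    for sentence in re.split(r"[.!?]+", text):
        concepts = sentence_concepts(sentence)
        for c in concepts:
            adjacency.setdefault(c, set())
        # Link every pair of concepts that co-occur in this sentence.
        for u, v in combinations(concepts, 2):
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


network = build_network("The cats chased the dogs. A dog chases a ball.")
for concept in sorted(network):
    print(concept, sorted(network[concept]))
```

In this toy text, "cat" and "ball" never share a sentence, so they are linked only through the shared concepts "chase" and "dog"; overlaps of this kind between sentence cliques are what connect the network as a text grows.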
PACS: 89.75.Fb – Structures and organization in complex systems / 89.75.Hc – Networks and genealogical trees / 02.10.Ox – Combinatorics; graph theory
© EDP Sciences, Società Italiana di Fisica, Springer-Verlag, 2006