https://doi.org/10.1140/epjb/e2004-00114-1
Correlated topologies in citation networks and the Web
School of Informatics and Departments of Computer Science and Physics,
Indiana University, Bloomington, IN 47408, USA
Corresponding author: fil@indiana.edu
Received: 5 November 2003
Revised: 26 February 2004
Published online: 14 May 2004
Information networks such as the scientific literature and the Web have been studied extensively by different communities, each focusing on different topological properties induced by citation links, textual content, and semantic relationships. This paper reviews work that brings these perspectives together in order to build better search tools and to understand how the Web's scale-free topology emerges from author behavior. I describe three topologies induced by different classes of similarity measures, and outline empirical data that allows us to quantify and map their correlations. The data is also used to study a power-law relationship between the content similarity of two documents and the probability that they are connected by citations or hyperlinks. This finding has led to a remarkably powerful growth model for information networks, which simultaneously predicts the distribution of degree and the distribution of content similarity across pairs of documents: Web pages connected by links and scientific articles connected by citations.
PACS: 89.20.Hh – World Wide Web, Internet / 89.75.-k – Complex systems
© EDP Sciences, Società Italiana di Fisica, Springer-Verlag, 2004