https://doi.org/10.1140/epjb/e2005-00121-8
The variation of Zipf's law in human language
INFM udR Roma1, Dip. di Fisica, Università “La Sapienza”, Piazzale A. Moro 5, 00185 Roma, Italy
Corresponding author: ramon@pil.phys.uniroma1.it
Received: 20 August 2004
Published online: 20 April 2005
Words in humans follow the so-called Zipf's law. More precisely, the
word frequency spectrum follows a power function, whose typical
exponent is , but significant variations are found.
We hypothesize that the full range of variation reflects our ability to
balance the goal of communication, i.e. maximizing the information
transfer, and the cost of communication imposed by the limitations of
the human brain. We show that the higher the importance of satisfying the goal
of communication, the higher the exponent.
Here, assuming that words are used according to their meaning, we
explain why
variation in β should be limited to a particular domain. On the
one hand, we explain a
non-trivial lower bound at about
for communication systems
neglecting the goal of communication. On the other hand,
we find a sudden divergence of β if a certain critical balance
is crossed.
At the same time, a sharp transition to maximum information transfer
and, unfortunately, maximum communication cost is found.
Consistently with the upper bound of real exponents, the maximum
finite value predicted is about
.
It is convenient for human language not to cross the transition
and remain in a domain where maximum information transfer is high but
at a reasonable cost.
Therefore, only a particular range of exponents should be found in human
speakers. The exponent β contains information about the balance
between cost and communicative efficiency.
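
As an illustration of the quantity discussed above, the following minimal sketch computes a word frequency spectrum, i.e. the number n(f) of distinct words that occur exactly f times, and fits a power function n(f) ∝ f^(-β) to it. The function names `frequency_spectrum` and `estimate_beta` are hypothetical, and the least-squares fit on log-log axes is a crude stand-in chosen for brevity; it is not the estimation procedure used in the paper.

```python
from collections import Counter
import math


def frequency_spectrum(tokens):
    """Return {f: n(f)}, the number of distinct words occurring exactly f times."""
    word_counts = Counter(tokens)          # word -> number of occurrences
    return Counter(word_counts.values())   # f -> number of words with that f


def estimate_beta(spectrum):
    """Crude estimate of beta, assuming n(f) is proportional to f**(-beta).

    Fits a straight line to (log f, log n(f)) by ordinary least squares
    and returns minus its slope. Illustrative only: this simple fit is
    known to be biased on real, noisy data.
    """
    points = [(math.log(f), math.log(n)) for f, n in spectrum.items() if n > 0]
    if len(points) < 2:
        raise ValueError("need at least two distinct frequencies")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx


if __name__ == "__main__":
    # Toy input; a real analysis would use a large tokenized corpus.
    tokens = "the cat saw the dog and the dog saw a bird".split()
    spectrum = frequency_spectrum(tokens)
    print(dict(spectrum))
    print(estimate_beta(spectrum))
```

On a real corpus, a larger estimated β would, under the hypothesis stated above, correspond to a greater weight on the goal of communication relative to its cost.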
PACS: 87.10.+e – General theory and mathematical aspects / 89.75.Da – Systems obeying scaling laws
© EDP Sciences, Società Italiana di Fisica, Springer-Verlag, 2005