Although it was born long before computer science and big data, quantitative linguistics still sits in a kind of limbo between science and technology on the one hand and the humanities on the other. This is due to a lack of awareness of the link between science and literature that quantitative linguistics has tried to consolidate since the beginning of the 20th century.

Quantitative Linguistics. The statistics of the words.

At that time, the American linguist George Kingsley Zipf, working at Harvard University, was trying to turn the study of language into an exact science, as in the case of physics. To do this, he proposed what are known as Zipfian laws; the first of them relates the frequency of a word to its rank, that is, its position when words are ordered from most to least frequent.
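In its usual textbook form, this first law says that the frequency f of the word at rank r is roughly proportional to 1 / r^a, with the exponent a close to 1. A minimal sketch of how such a rank-frequency table can be built is given here, assuming simple whitespace tokenization; the function name `rank_frequency` and the sample sentence are illustrative only and do not come from the book.

```python
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, frequency) tuples, most frequent word first."""
    # Assumption: lowercase the text and split on whitespace; real corpora
    # need a proper tokenizer that handles punctuation.
    words = text.lower().split()
    counts = Counter(words)
    # most_common() orders words by descending frequency.
    return [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]


sample = "the cat and the dog and the bird"
table = rank_frequency(sample)
for rank, word, freq in table:
    # Under an idealized Zipf's law with exponent 1, rank * freq would stay
    # roughly constant; a toy sentence is far too small to show that.
    print(rank, word, freq, rank * freq)
```

On a corpus of realistic size, plotting frequency against rank on logarithmic axes would be the usual way to check whether the points fall near a straight line.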

The second is based on the principle of least effort. “Zipf predicted, based on this principle, what forces should govern the use of the words of a vocabulary based on the interests of the speaker and the listener,” write Toni Hernández and Ramón Ferrer i Cancho in their book Quantitative Linguistics. The statistics of the words.

Next came the mathematical theory of information, which the American mathematician and engineer Claude Shannon (1916-2001) formalized in 1948 and complemented a year later with the contributions of Warren Weaver (1894-1978), giving communication systems a mathematical-physical framework.

Other laws, such as Herdan’s and Menzerath-Altmann’s, have shown that the discipline draws internally on linguistic typology, phonetics and computational linguistics, and externally on information theory, cognitive science (psychology) and information science or documentation.

Finally, the authors of the book propose, as one of the challenges for this discipline, exploring linguistic processing and understanding through ontology, semantics, pedagogy and the neurological and physiological correlates of language and, in the future, chemical communication and genomic linguistics.

