Room DSF, 12:15
Institute Seminar

prof. dr hab. Stanisław Drożdż, IFJ PAN

Multiscale correlations in narrative texts

Language is a highly complex system, one for which the maxim "more is different" holds especially well. The most natural linguistic constructs for quantifying this complexity are therefore sentences and their mutual arrangement in texts. A study of sentence length variability (SLV) in a large corpus of world-famous literary texts shows that it involves a cascade-like alternation of sentences of various lengths, such that the power spectra S(f) of the SLV universally develop a convincing 1/f-type scaling, with exponents close to those identified earlier in musical compositions and in brain waves. The overwhelming majority of the studied texts obey such fractal attributes, but hypertext-like, "stream of consciousness" novels are especially spectacular in this respect. In addition, these novels appear to develop structures characteristic of irreducibly interwoven sets of fractals, called multifractals, which indicates that the related long-range correlations also carry a nonlinear component. This points to a distinct role of the recurrence times of full stops along a text in inducing the long-range correlations. At the same time, when treated as one extra word, the full stop appears to obey the Zipfian rank-frequency distribution. Furthermore, from a statistical viewpoint, all punctuation marks reveal properties that are qualitatively similar to those of the most frequent words.
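
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how a sentence length series and its power spectrum S(f) might be computed. It is not the speaker's method: the function names, the naive splitting on ".", "!" and "?", the plain discrete Fourier transform and the least-squares fit of the exponent are all assumptions made for illustration.

```python
import cmath
import math
import re


def sentence_lengths(text):
    """Word count of each sentence (assumption: sentences end at ., ! or ?)."""
    sentences = re.split(r"[.!?]+", text)
    return [len(s.split()) for s in sentences if s.split()]


def power_spectrum(series):
    """Return (freqs, power) at positive frequencies via a plain O(n^2) DFT."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the zero-frequency component
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def spectral_exponent(freqs, power):
    """Estimate beta in S(f) ~ 1/f^beta from a log-log least-squares fit."""
    pts = [(math.log(f), math.log(p)) for f, p in zip(freqs, power) if p > 0]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    slope = (sum((x - mx) * (y - my) for x, y in pts)
             / sum((x - mx) ** 2 for x, _ in pts))
    return -slope
```

Applied to a full novel, `spectral_exponent(*power_spectrum(sentence_lengths(text)))` would give a rough estimate of the scaling exponent. A careful analysis would need proper sentence segmentation and spectral averaging, neither of which this sketch attempts.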